diff --git a/assets/images/consume/export-targets/export-target-health-check.png b/assets/images/consume/export-targets/export-target-health-check.png
new file mode 100644
index 00000000..eaf3e413
Binary files /dev/null and b/assets/images/consume/export-targets/export-target-health-check.png differ
diff --git a/assets/images/consume/export-targets/golden-record-streams.png b/assets/images/consume/export-targets/golden-record-streams.png
new file mode 100644
index 00000000..54d6c536
Binary files /dev/null and b/assets/images/consume/export-targets/golden-record-streams.png differ
diff --git a/assets/images/consume/export-targets/stream-log.png b/assets/images/consume/export-targets/stream-log.png
new file mode 100644
index 00000000..6b55fd58
Binary files /dev/null and b/assets/images/consume/export-targets/stream-log.png differ
diff --git a/assets/images/getting-started/data-cleaning/create-a-clean-project-1.png b/assets/images/getting-started/data-cleaning/create-a-clean-project-1.png
index 56f8e6cd..f0307fab 100644
Binary files a/assets/images/getting-started/data-cleaning/create-a-clean-project-1.png and b/assets/images/getting-started/data-cleaning/create-a-clean-project-1.png differ
diff --git a/assets/images/getting-started/data-cleaning/create-a-clean-project-2.png b/assets/images/getting-started/data-cleaning/create-a-clean-project-2.png
index 824c914f..543d3550 100644
Binary files a/assets/images/getting-started/data-cleaning/create-a-clean-project-2.png and b/assets/images/getting-started/data-cleaning/create-a-clean-project-2.png differ
diff --git a/assets/images/getting-started/data-cleaning/create-a-clean-project-3.png b/assets/images/getting-started/data-cleaning/create-a-clean-project-3.png
index 4b350a3a..1bb112e0 100644
Binary files a/assets/images/getting-started/data-cleaning/create-a-clean-project-3.png and b/assets/images/getting-started/data-cleaning/create-a-clean-project-3.png differ
diff --git a/assets/images/getting-started/data-cleaning/find-data-1.png b/assets/images/getting-started/data-cleaning/find-data-1.png
index ffd4f1fe..0b373f2e 100644
Binary files a/assets/images/getting-started/data-cleaning/find-data-1.png and b/assets/images/getting-started/data-cleaning/find-data-1.png differ
diff --git a/assets/images/getting-started/data-cleaning/find-data-2.png b/assets/images/getting-started/data-cleaning/find-data-2.png
index ff20b81c..8334f28f 100644
Binary files a/assets/images/getting-started/data-cleaning/find-data-2.png and b/assets/images/getting-started/data-cleaning/find-data-2.png differ
diff --git a/assets/images/getting-started/data-cleaning/find-data-3.png b/assets/images/getting-started/data-cleaning/find-data-3.png
index 8645cfe1..c51e6fab 100644
Binary files a/assets/images/getting-started/data-cleaning/find-data-3.png and b/assets/images/getting-started/data-cleaning/find-data-3.png differ
diff --git a/assets/images/getting-started/data-cleaning/find-data-4.png b/assets/images/getting-started/data-cleaning/find-data-4.png
index 85d3c574..291bc2d4 100644
Binary files a/assets/images/getting-started/data-cleaning/find-data-4.png and b/assets/images/getting-started/data-cleaning/find-data-4.png differ
diff --git a/assets/images/getting-started/data-cleaning/modify-data-3.png b/assets/images/getting-started/data-cleaning/modify-data-3.png
index 0cb289df..fdd0e60e 100644
Binary files a/assets/images/getting-started/data-cleaning/modify-data-3.png and b/assets/images/getting-started/data-cleaning/modify-data-3.png differ
diff --git a/assets/images/getting-started/data-ingestion/create-mapping-3.png b/assets/images/getting-started/data-ingestion/create-mapping-3.png
index 08b8df9d..aa0d146c 100644
Binary files a/assets/images/getting-started/data-ingestion/create-mapping-3.png and b/assets/images/getting-started/data-ingestion/create-mapping-3.png differ
diff --git a/assets/images/getting-started/data-ingestion/create-mapping-5.png b/assets/images/getting-started/data-ingestion/create-mapping-5.png
index 69dd0578..47a0e641 100644
Binary files a/assets/images/getting-started/data-ingestion/create-mapping-5.png and b/assets/images/getting-started/data-ingestion/create-mapping-5.png differ
diff --git a/assets/images/getting-started/data-ingestion/create-mapping-6.png b/assets/images/getting-started/data-ingestion/create-mapping-6.png
index 9cc2297b..16db1eb9 100644
Binary files a/assets/images/getting-started/data-ingestion/create-mapping-6.png and b/assets/images/getting-started/data-ingestion/create-mapping-6.png differ
diff --git a/assets/images/getting-started/data-ingestion/create-mapping-7.png b/assets/images/getting-started/data-ingestion/create-mapping-7.png
index d85a2a36..886be8bc 100644
Binary files a/assets/images/getting-started/data-ingestion/create-mapping-7.png and b/assets/images/getting-started/data-ingestion/create-mapping-7.png differ
diff --git a/assets/images/getting-started/data-ingestion/files-add.png b/assets/images/getting-started/data-ingestion/files-add.png
new file mode 100644
index 00000000..de082311
Binary files /dev/null and b/assets/images/getting-started/data-ingestion/files-add.png differ
diff --git a/assets/images/getting-started/data-ingestion/golden-records-add-records.png b/assets/images/getting-started/data-ingestion/golden-records-add-records.png
new file mode 100644
index 00000000..5ee02b75
Binary files /dev/null and b/assets/images/getting-started/data-ingestion/golden-records-add-records.png differ
diff --git a/assets/images/getting-started/data-ingestion/process-data-1.png b/assets/images/getting-started/data-ingestion/process-data-1.png
index 5e562598..b8a70b39 100644
Binary files a/assets/images/getting-started/data-ingestion/process-data-1.png and b/assets/images/getting-started/data-ingestion/process-data-1.png differ
diff --git a/assets/images/getting-started/data-ingestion/process-data-2.png b/assets/images/getting-started/data-ingestion/process-data-2.png
index 3eca640a..6e75f0b5 100644
Binary files a/assets/images/getting-started/data-ingestion/process-data-2.png and b/assets/images/getting-started/data-ingestion/process-data-2.png differ
diff --git a/assets/images/getting-started/data-ingestion/search-for-data-1.png b/assets/images/getting-started/data-ingestion/search-for-data-1.png
index d873eae2..b99f625f 100644
Binary files a/assets/images/getting-started/data-ingestion/search-for-data-1.png and b/assets/images/getting-started/data-ingestion/search-for-data-1.png differ
diff --git a/assets/images/getting-started/data-ingestion/search-for-data-2.png b/assets/images/getting-started/data-ingestion/search-for-data-2.png
index e7894359..6c7fe71a 100644
Binary files a/assets/images/getting-started/data-ingestion/search-for-data-2.png and b/assets/images/getting-started/data-ingestion/search-for-data-2.png differ
diff --git a/assets/images/getting-started/data-ingestion/upload-data-from-file-add.png b/assets/images/getting-started/data-ingestion/upload-data-from-file-add.png
new file mode 100644
index 00000000..fe7748c9
Binary files /dev/null and b/assets/images/getting-started/data-ingestion/upload-data-from-file-add.png differ
diff --git a/assets/images/getting-started/data-ingestion/view-imported-data-1.png b/assets/images/getting-started/data-ingestion/view-imported-data-1.png
index ebed6aab..649151f1 100644
Binary files a/assets/images/getting-started/data-ingestion/view-imported-data-1.png and b/assets/images/getting-started/data-ingestion/view-imported-data-1.png differ
diff --git a/assets/images/getting-started/data-streaming/add-export-target-1.png b/assets/images/getting-started/data-streaming/add-export-target-1.png
index c3e98f17..c1eb9fd0 100644
Binary files a/assets/images/getting-started/data-streaming/add-export-target-1.png and b/assets/images/getting-started/data-streaming/add-export-target-1.png differ
diff --git a/assets/images/getting-started/data-streaming/add-export-target-2.png b/assets/images/getting-started/data-streaming/add-export-target-2.png
index 7885bf42..f034b424 100644
Binary files a/assets/images/getting-started/data-streaming/add-export-target-2.png and b/assets/images/getting-started/data-streaming/add-export-target-2.png differ
diff --git a/assets/images/getting-started/data-streaming/add-export-target-3.png b/assets/images/getting-started/data-streaming/add-export-target-3.png
index 9aed4efb..a30ac26a 100644
Binary files a/assets/images/getting-started/data-streaming/add-export-target-3.png and b/assets/images/getting-started/data-streaming/add-export-target-3.png differ
diff --git a/assets/images/getting-started/data-streaming/create-stream-2.png b/assets/images/getting-started/data-streaming/create-stream-2.png
index a5edd9e2..0e57c071 100644
Binary files a/assets/images/getting-started/data-streaming/create-stream-2.png and b/assets/images/getting-started/data-streaming/create-stream-2.png differ
diff --git a/assets/images/getting-started/deduplication/dedup-2.png b/assets/images/getting-started/deduplication/dedup-2.png
index f8602003..5e958ada 100644
Binary files a/assets/images/getting-started/deduplication/dedup-2.png and b/assets/images/getting-started/deduplication/dedup-2.png differ
diff --git a/assets/images/getting-started/glossary/create-term-1.png b/assets/images/getting-started/glossary/create-term-1.png
index 9d2666e1..15e6de34 100644
Binary files a/assets/images/getting-started/glossary/create-term-1.png and b/assets/images/getting-started/glossary/create-term-1.png differ
diff --git a/assets/images/getting-started/glossary/create-term-4.png b/assets/images/getting-started/glossary/create-term-4.png
index 6082cef5..7c0bb9c3 100644
Binary files a/assets/images/getting-started/glossary/create-term-4.png and b/assets/images/getting-started/glossary/create-term-4.png differ
diff --git a/assets/images/getting-started/glossary/manage-glossary-1.png b/assets/images/getting-started/glossary/manage-glossary-1.png
index 367236b1..e6b693b9 100644
Binary files a/assets/images/getting-started/glossary/manage-glossary-1.png and b/assets/images/getting-started/glossary/manage-glossary-1.png differ
diff --git a/assets/images/getting-started/glossary/manage-glossary-2.png b/assets/images/getting-started/glossary/manage-glossary-2.png
index 3664bf41..1eecbe0f 100644
Binary files a/assets/images/getting-started/glossary/manage-glossary-2.png and b/assets/images/getting-started/glossary/manage-glossary-2.png differ
diff --git a/assets/images/getting-started/hierarchy-builder/create-hierarchy-1.png b/assets/images/getting-started/hierarchy-builder/create-hierarchy-1.png
index f8a03a3e..75f020e9 100644
Binary files a/assets/images/getting-started/hierarchy-builder/create-hierarchy-1.png and b/assets/images/getting-started/hierarchy-builder/create-hierarchy-1.png differ
diff --git a/assets/images/getting-started/hierarchy-builder/create-hierarchy-2.png b/assets/images/getting-started/hierarchy-builder/create-hierarchy-2.png
new file mode 100644
index 00000000..8f55e394
Binary files /dev/null and b/assets/images/getting-started/hierarchy-builder/create-hierarchy-2.png differ
diff --git a/assets/images/getting-started/hierarchy-builder/update-stream-1.png b/assets/images/getting-started/hierarchy-builder/update-stream-1.png
index 1a91af61..339eccd0 100644
Binary files a/assets/images/getting-started/hierarchy-builder/update-stream-1.png and b/assets/images/getting-started/hierarchy-builder/update-stream-1.png differ
diff --git a/assets/images/getting-started/relations/add-edge-1.png b/assets/images/getting-started/relations/add-edge-1.png
index 9417f93f..338f0ad6 100644
Binary files a/assets/images/getting-started/relations/add-edge-1.png and b/assets/images/getting-started/relations/add-edge-1.png differ
diff --git a/assets/images/getting-started/relations/entity-mapping-2.png b/assets/images/getting-started/relations/entity-mapping-2.png
index c35ba136..27c696a7 100644
Binary files a/assets/images/getting-started/relations/entity-mapping-2.png and b/assets/images/getting-started/relations/entity-mapping-2.png differ
diff --git a/assets/images/getting-started/relations/view-relations-1.png b/assets/images/getting-started/relations/view-relations-1.png
index 85a2d207..530d6718 100644
Binary files a/assets/images/getting-started/relations/view-relations-1.png and b/assets/images/getting-started/relations/view-relations-1.png differ
diff --git a/assets/images/getting-started/relations/view-relations-2.png b/assets/images/getting-started/relations/view-relations-2.png
index 8d97614b..e41a25e2 100644
Binary files a/assets/images/getting-started/relations/view-relations-2.png and b/assets/images/getting-started/relations/view-relations-2.png differ
diff --git a/assets/images/getting-started/rule-builder/rule-builder-2.png b/assets/images/getting-started/rule-builder/rule-builder-2.png
index e149acf9..364c130e 100644
Binary files a/assets/images/getting-started/rule-builder/rule-builder-2.png and b/assets/images/getting-started/rule-builder/rule-builder-2.png differ
diff --git a/assets/images/getting-started/rule-builder/rule-builder-5.png b/assets/images/getting-started/rule-builder/rule-builder-5.png
index 9b31b51c..d1f35da5 100644
Binary files a/assets/images/getting-started/rule-builder/rule-builder-5.png and b/assets/images/getting-started/rule-builder/rule-builder-5.png differ
diff --git a/assets/images/getting-started/rule-builder/rule-builder-6.png b/assets/images/getting-started/rule-builder/rule-builder-6.png
index 9b60c60d..189f171f 100644
Binary files a/assets/images/getting-started/rule-builder/rule-builder-6.png and b/assets/images/getting-started/rule-builder/rule-builder-6.png differ
diff --git a/assets/images/getting-started/rule-builder/rule-builder-7.png b/assets/images/getting-started/rule-builder/rule-builder-7.png
index c6011bbe..413e9cf5 100644
Binary files a/assets/images/getting-started/rule-builder/rule-builder-7.png and b/assets/images/getting-started/rule-builder/rule-builder-7.png differ
diff --git a/assets/images/getting-started/rule-builder/rule-builder-8.png b/assets/images/getting-started/rule-builder/rule-builder-8.png
index 37ef62bd..001be663 100644
Binary files a/assets/images/getting-started/rule-builder/rule-builder-8.png and b/assets/images/getting-started/rule-builder/rule-builder-8.png differ
diff --git a/assets/images/getting-started/rule-builder/rule-builder-9.png b/assets/images/getting-started/rule-builder/rule-builder-9.png
index 0461a1b6..97ecc640 100644
Binary files a/assets/images/getting-started/rule-builder/rule-builder-9.png and b/assets/images/getting-started/rule-builder/rule-builder-9.png differ
diff --git a/assets/images/integration/additional-operations/access-validations.png b/assets/images/integration/additional-operations/access-validations.png
new file mode 100644
index 00000000..b125c8a1
Binary files /dev/null and b/assets/images/integration/additional-operations/access-validations.png differ
diff --git a/assets/images/integration/additional-operations/add-column-configuration.png b/assets/images/integration/additional-operations/add-column-configuration.png
new file mode 100644
index 00000000..51c65fad
Binary files /dev/null and b/assets/images/integration/additional-operations/add-column-configuration.png differ
diff --git a/assets/images/integration/additional-operations/add-column-type.png b/assets/images/integration/additional-operations/add-column-type.png
new file mode 100644
index 00000000..68acdba0
Binary files /dev/null and b/assets/images/integration/additional-operations/add-column-type.png differ
diff --git a/assets/images/integration/additional-operations/clear-records.png b/assets/images/integration/additional-operations/clear-records.png
new file mode 100644
index 00000000..b9d89fa0
Binary files /dev/null and b/assets/images/integration/additional-operations/clear-records.png differ
diff --git a/assets/images/integration/additional-operations/download-original-file.png b/assets/images/integration/additional-operations/download-original-file.png
new file mode 100644
index 00000000..ea4dd14f
Binary files /dev/null and b/assets/images/integration/additional-operations/download-original-file.png differ
diff --git a/assets/images/integration/additional-operations/duplicates-preview.png b/assets/images/integration/additional-operations/duplicates-preview.png
new file mode 100644
index 00000000..e148ce84
Binary files /dev/null and b/assets/images/integration/additional-operations/duplicates-preview.png differ
diff --git a/assets/images/integration/additional-operations/edit-values-manually-operations.png b/assets/images/integration/additional-operations/edit-values-manually-operations.png
new file mode 100644
index 00000000..d34aa257
Binary files /dev/null and b/assets/images/integration/additional-operations/edit-values-manually-operations.png differ
diff --git a/assets/images/integration/additional-operations/edit-values-manually.png b/assets/images/integration/additional-operations/edit-values-manually.png
new file mode 100644
index 00000000..1e6dc6ca
Binary files /dev/null and b/assets/images/integration/additional-operations/edit-values-manually.png differ
diff --git a/assets/images/integration/additional-operations/filter-options.png b/assets/images/integration/additional-operations/filter-options.png
new file mode 100644
index 00000000..5426e99d
Binary files /dev/null and b/assets/images/integration/additional-operations/filter-options.png differ
diff --git a/assets/images/integration/additional-operations/filters-pane.png b/assets/images/integration/additional-operations/filters-pane.png
new file mode 100644
index 00000000..fe707d2c
Binary files /dev/null and b/assets/images/integration/additional-operations/filters-pane.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-data-set-queues.png b/assets/images/integration/additional-operations/monitoring-data-set-queues.png
new file mode 100644
index 00000000..b1f68720
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-data-set-queues.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-error.png b/assets/images/integration/additional-operations/monitoring-error.png
new file mode 100644
index 00000000..29d09e65
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-error.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-global-queues.png b/assets/images/integration/additional-operations/monitoring-global-queues.png
new file mode 100644
index 00000000..53afad42
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-global-queues.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-hourly-ingestion-reports.png b/assets/images/integration/additional-operations/monitoring-hourly-ingestion-reports.png
new file mode 100644
index 00000000..a15d53f1
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-hourly-ingestion-reports.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-postman-receipt-id.png b/assets/images/integration/additional-operations/monitoring-postman-receipt-id.png
new file mode 100644
index 00000000..a75a0c00
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-postman-receipt-id.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-receipt-id-ingested-records.png b/assets/images/integration/additional-operations/monitoring-receipt-id-ingested-records.png
new file mode 100644
index 00000000..fcebfc6e
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-receipt-id-ingested-records.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-receipt-id-produced-golden-records.png b/assets/images/integration/additional-operations/monitoring-receipt-id-produced-golden-records.png
new file mode 100644
index 00000000..3f2d2f94
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-receipt-id-produced-golden-records.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-remove-records.png b/assets/images/integration/additional-operations/monitoring-remove-records.png
new file mode 100644
index 00000000..81760036
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-remove-records.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-search-receipt-id.png b/assets/images/integration/additional-operations/monitoring-search-receipt-id.png
new file mode 100644
index 00000000..4e1b4f8c
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-search-receipt-id.png differ
diff --git a/assets/images/integration/additional-operations/monitoring-total.png b/assets/images/integration/additional-operations/monitoring-total.png
new file mode 100644
index 00000000..243aaa9f
Binary files /dev/null and b/assets/images/integration/additional-operations/monitoring-total.png differ
diff --git a/assets/images/integration/additional-operations/operations-options.png b/assets/images/integration/additional-operations/operations-options.png
new file mode 100644
index 00000000..d01f2e6e
Binary files /dev/null and b/assets/images/integration/additional-operations/operations-options.png differ
diff --git a/assets/images/integration/additional-operations/profiling-feature-flag.png b/assets/images/integration/additional-operations/profiling-feature-flag.png
new file mode 100644
index 00000000..ca319923
Binary files /dev/null and b/assets/images/integration/additional-operations/profiling-feature-flag.png differ
diff --git a/assets/images/integration/additional-operations/remove-source-records.png b/assets/images/integration/additional-operations/remove-source-records.png
new file mode 100644
index 00000000..7de75e16
Binary files /dev/null and b/assets/images/integration/additional-operations/remove-source-records.png differ
diff --git a/assets/images/integration/additional-operations/search-key-word.png b/assets/images/integration/additional-operations/search-key-word.png
new file mode 100644
index 00000000..6a2b1783
Binary files /dev/null and b/assets/images/integration/additional-operations/search-key-word.png differ
diff --git a/assets/images/integration/additional-operations/search-request-id.png b/assets/images/integration/additional-operations/search-request-id.png
new file mode 100644
index 00000000..36dbd9c8
Binary files /dev/null and b/assets/images/integration/additional-operations/search-request-id.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-advanced-mapping.png b/assets/images/integration/additional-operations/source-record-approval-advanced-mapping.png
new file mode 100644
index 00000000..35380d40
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-advanced-mapping.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-approval-tab.png b/assets/images/integration/additional-operations/source-record-approval-approval-tab.png
new file mode 100644
index 00000000..a6c1935c
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-approval-tab.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-data-set.png b/assets/images/integration/additional-operations/source-record-approval-data-set.png
new file mode 100644
index 00000000..c3b28bfc
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-data-set.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-flow.png b/assets/images/integration/additional-operations/source-record-approval-flow.png
new file mode 100644
index 00000000..096c39bb
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-flow.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-manual-configuration.png b/assets/images/integration/additional-operations/source-record-approval-manual-configuration.png
new file mode 100644
index 00000000..832ccc9c
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-manual-configuration.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-manual-data-entry.png b/assets/images/integration/additional-operations/source-record-approval-manual-data-entry.png
new file mode 100644
index 00000000..36a20daa
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-manual-data-entry.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-notification.png b/assets/images/integration/additional-operations/source-record-approval-notification.png
new file mode 100644
index 00000000..034d0852
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-notification.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-pre-process-rule.png b/assets/images/integration/additional-operations/source-record-approval-pre-process-rule.png
new file mode 100644
index 00000000..47a1fd79
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-pre-process-rule.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-process-tab.png b/assets/images/integration/additional-operations/source-record-approval-process-tab.png
new file mode 100644
index 00000000..e10f2e9f
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-process-tab.png differ
diff --git a/assets/images/integration/additional-operations/source-record-approval-property-rule.png b/assets/images/integration/additional-operations/source-record-approval-property-rule.png
new file mode 100644
index 00000000..5d0f28e9
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-approval-property-rule.png differ
diff --git a/assets/images/integration/additional-operations/source-record-validations-advanced-code.png b/assets/images/integration/additional-operations/source-record-validations-advanced-code.png
new file mode 100644
index 00000000..21fd4046
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-validations-advanced-code.png differ
diff --git a/assets/images/integration/additional-operations/source-record-validations-advanced-result.png b/assets/images/integration/additional-operations/source-record-validations-advanced-result.png
new file mode 100644
index 00000000..1e63c18d
Binary files /dev/null and b/assets/images/integration/additional-operations/source-record-validations-advanced-result.png differ
diff --git a/assets/images/integration/additional-operations/switch-to-edit-mode.png b/assets/images/integration/additional-operations/switch-to-edit-mode.png
new file mode 100644
index 00000000..81a60b1a
Binary files /dev/null and b/assets/images/integration/additional-operations/switch-to-edit-mode.png differ
diff --git a/assets/images/integration/additional-operations/system-healthchecks.png b/assets/images/integration/additional-operations/system-healthchecks.png
new file mode 100644
index 00000000..e228f2a4
Binary files /dev/null and b/assets/images/integration/additional-operations/system-healthchecks.png differ
diff --git a/assets/images/integration/additional-operations/validations-advanced-validations.png b/assets/images/integration/additional-operations/validations-advanced-validations.png
new file mode 100644
index 00000000..480cd10f
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-advanced-validations.png differ
diff --git a/assets/images/integration/additional-operations/validations-column-hover.png b/assets/images/integration/additional-operations/validations-column-hover.png
new file mode 100644
index 00000000..eb0a265e
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-column-hover.png differ
diff --git a/assets/images/integration/additional-operations/validations-filter.png b/assets/images/integration/additional-operations/validations-filter.png
new file mode 100644
index 00000000..01418678
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-filter.png differ
diff --git a/assets/images/integration/additional-operations/validations-filters-sorting-reset.png b/assets/images/integration/additional-operations/validations-filters-sorting-reset.png
new file mode 100644
index 00000000..da7f22ef
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-filters-sorting-reset.png differ
diff --git a/assets/images/integration/additional-operations/validations-invalid-values-fixed.png b/assets/images/integration/additional-operations/validations-invalid-values-fixed.png
new file mode 100644
index 00000000..c4998282
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-invalid-values-fixed.png differ
diff --git a/assets/images/integration/additional-operations/validations-invalid-values.png b/assets/images/integration/additional-operations/validations-invalid-values.png
new file mode 100644
index 00000000..d85ac273
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-invalid-values.png differ
diff --git a/assets/images/integration/additional-operations/validations-manual-setup.png b/assets/images/integration/additional-operations/validations-manual-setup.png
new file mode 100644
index 00000000..496969b3
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-manual-setup.png differ
diff --git a/assets/images/integration/additional-operations/validations-operations.png b/assets/images/integration/additional-operations/validations-operations.png
new file mode 100644
index 00000000..16494a1c
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-operations.png differ
diff --git a/assets/images/integration/additional-operations/validations-pane.png b/assets/images/integration/additional-operations/validations-pane.png
new file mode 100644
index 00000000..aabfbe34
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-pane.png differ
diff --git a/assets/images/integration/additional-operations/validations-refresh-delete-edit.png b/assets/images/integration/additional-operations/validations-refresh-delete-edit.png
new file mode 100644
index 00000000..64612038
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-refresh-delete-edit.png differ
diff --git a/assets/images/integration/additional-operations/validations-reset-methods.png b/assets/images/integration/additional-operations/validations-reset-methods.png
new file mode 100644
index 00000000..8facc83d
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-reset-methods.png differ
diff --git a/assets/images/integration/additional-operations/validations-result.png b/assets/images/integration/additional-operations/validations-result.png
new file mode 100644
index 00000000..8b9db9e4
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-result.png differ
diff --git a/assets/images/integration/additional-operations/validations-search.png b/assets/images/integration/additional-operations/validations-search.png
new file mode 100644
index 00000000..71353b62
Binary files /dev/null and b/assets/images/integration/additional-operations/validations-search.png differ
diff --git a/assets/images/integration/additional-operations/view-duplicates.png b/assets/images/integration/additional-operations/view-duplicates.png
new file mode 100644
index 00000000..d903412b
Binary files /dev/null and b/assets/images/integration/additional-operations/view-duplicates.png differ
diff --git a/assets/images/integration/additional-operations/view-profiling-for-data-set.png b/assets/images/integration/additional-operations/view-profiling-for-data-set.png
new file mode 100644
index 00000000..35d16c24
Binary files /dev/null and b/assets/images/integration/additional-operations/view-profiling-for-data-set.png differ
diff --git a/assets/images/integration/additional-operations/view-profiling-for-golden-records.png b/assets/images/integration/additional-operations/view-profiling-for-golden-records.png
new file mode 100644
index 00000000..d7b41bd0
Binary files /dev/null and b/assets/images/integration/additional-operations/view-profiling-for-golden-records.png differ
diff --git a/assets/images/integration/data-sources/review-mapping-1.png b/assets/images/integration/data-sources/review-mapping-1.png
index 6ed2543c..0a252c14 100644
Binary files a/assets/images/integration/data-sources/review-mapping-1.png and b/assets/images/integration/data-sources/review-mapping-1.png differ
diff --git a/assets/images/integration/data-sources/review-mapping-2.png b/assets/images/integration/data-sources/review-mapping-2.png
index c469f746..e4d3102a 100644
Binary files a/assets/images/integration/data-sources/review-mapping-2.png and b/assets/images/integration/data-sources/review-mapping-2.png differ
diff --git a/assets/images/integration/data-sources/review-mapping-3.png b/assets/images/integration/data-sources/review-mapping-3.png
index 6b9217e9..930e0970 100644
Binary files a/assets/images/integration/data-sources/review-mapping-3.png and b/assets/images/integration/data-sources/review-mapping-3.png differ
diff --git a/assets/images/integration/data-sources/review-mapping-4.png b/assets/images/integration/data-sources/review-mapping-4.png
index 12e5c730..3c221cf9 100644
Binary files a/assets/images/integration/data-sources/review-mapping-4.png and b/assets/images/integration/data-sources/review-mapping-4.png differ
diff --git a/assets/images/integration/data-sources/review-mapping-5.png b/assets/images/integration/data-sources/review-mapping-5.png
index 218e01af..e33b2f92 100644
Binary files a/assets/images/integration/data-sources/review-mapping-5.png and b/assets/images/integration/data-sources/review-mapping-5.png differ
diff --git a/assets/images/integration/hourly-ingestion-reports.png b/assets/images/integration/hourly-ingestion-reports.png
new file mode 100644
index 00000000..55e83f4a
Binary files /dev/null and b/assets/images/integration/hourly-ingestion-reports.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-add-in-action-center.png b/assets/images/integration/manual-data-entry/manual-data-entry-add-in-action-center.png
new file mode 100644
index 00000000..a6abbc09
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-add-in-action-center.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-add-single-record.png b/assets/images/integration/manual-data-entry/manual-data-entry-add-single-record.png
new file mode 100644
index 00000000..7d0d2b93
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-add-single-record.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-create-form-field.png b/assets/images/integration/manual-data-entry/manual-data-entry-create-form-field.png
new file mode 100644
index 00000000..b1731685
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-create-form-field.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-create-project.png b/assets/images/integration/manual-data-entry/manual-data-entry-create-project.png
new file mode 100644
index 00000000..c603b896
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-create-project.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-delete-session.png b/assets/images/integration/manual-data-entry/manual-data-entry-delete-session.png
new file mode 100644
index 00000000..f8ba51ef
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-delete-session.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-mapping-general-details.png b/assets/images/integration/manual-data-entry/manual-data-entry-mapping-general-details.png
new file mode 100644
index 00000000..978de740
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-mapping-general-details.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-mapping-identifiers.png b/assets/images/integration/manual-data-entry/manual-data-entry-mapping-identifiers.png
new file mode 100644
index 00000000..7d28198a
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-mapping-identifiers.png differ
diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-mapping-primary-identifier.png b/assets/images/integration/manual-data-entry/manual-data-entry-mapping-primary-identifier.png
new file mode 100644
index 00000000..717d0839
Binary files
/dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-mapping-primary-identifier.png differ diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-mapping.png b/assets/images/integration/manual-data-entry/manual-data-entry-mapping.png new file mode 100644 index 00000000..725a76ae Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-mapping.png differ diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-move-field.png b/assets/images/integration/manual-data-entry/manual-data-entry-move-field.png new file mode 100644 index 00000000..3b6d40eb Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-move-field.png differ diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-multiple-records.png b/assets/images/integration/manual-data-entry/manual-data-entry-multiple-records.png new file mode 100644 index 00000000..fa0eb367 Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-multiple-records.png differ diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-pick-list.png b/assets/images/integration/manual-data-entry/manual-data-entry-pick-list.png new file mode 100644 index 00000000..8383be8b Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-pick-list.png differ diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-quality.png b/assets/images/integration/manual-data-entry/manual-data-entry-quality.png new file mode 100644 index 00000000..dd6689cc Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-quality.png differ diff --git a/assets/images/integration/manual-data-entry/manual-data-entry-tabular-view-feature-flag.png b/assets/images/integration/manual-data-entry/manual-data-entry-tabular-view-feature-flag.png new file mode 100644 index 00000000..ca7e6cf7 
Binary files /dev/null and b/assets/images/integration/manual-data-entry/manual-data-entry-tabular-view-feature-flag.png differ diff --git a/assets/images/key-terms-and-features/codes-1.gif b/assets/images/key-terms-and-features/codes-1.gif index 4bfbcefc..6c69c99e 100644 Binary files a/assets/images/key-terms-and-features/codes-1.gif and b/assets/images/key-terms-and-features/codes-1.gif differ diff --git a/assets/images/key-terms-and-features/codes-2.png b/assets/images/key-terms-and-features/codes-2.png index 64613d04..d724f97d 100644 Binary files a/assets/images/key-terms-and-features/codes-2.png and b/assets/images/key-terms-and-features/codes-2.png differ diff --git a/assets/images/key-terms-and-features/codes-3.png b/assets/images/key-terms-and-features/codes-3.png index 521a1f00..ddc30fe4 100644 Binary files a/assets/images/key-terms-and-features/codes-3.png and b/assets/images/key-terms-and-features/codes-3.png differ diff --git a/assets/images/key-terms-and-features/codes-4.png b/assets/images/key-terms-and-features/codes-4.png index 7d1430cc..88fd7e2d 100644 Binary files a/assets/images/key-terms-and-features/codes-4.png and b/assets/images/key-terms-and-features/codes-4.png differ diff --git a/assets/images/key-terms-and-features/codes-merge-1.gif b/assets/images/key-terms-and-features/codes-merge-1.gif index b14c3d25..8a330b96 100644 Binary files a/assets/images/key-terms-and-features/codes-merge-1.gif and b/assets/images/key-terms-and-features/codes-merge-1.gif differ diff --git a/assets/images/key-terms-and-features/entity-origin-code.png b/assets/images/key-terms-and-features/entity-origin-code.png index 779c45db..ffb0997c 100644 Binary files a/assets/images/key-terms-and-features/entity-origin-code.png and b/assets/images/key-terms-and-features/entity-origin-code.png differ diff --git a/assets/images/key-terms-and-features/linking-golden-records.png b/assets/images/key-terms-and-features/linking-golden-records.png index 7135d609..5cead5e3 100644 
Binary files a/assets/images/key-terms-and-features/linking-golden-records.png and b/assets/images/key-terms-and-features/linking-golden-records.png differ diff --git a/assets/images/key-terms-and-features/merging-by-codes-2.png b/assets/images/key-terms-and-features/merging-by-codes-2.png index 5f1ae231..1d5c946f 100644 Binary files a/assets/images/key-terms-and-features/merging-by-codes-2.png and b/assets/images/key-terms-and-features/merging-by-codes-2.png differ diff --git a/assets/images/key-terms-and-features/save-a-search.png b/assets/images/key-terms-and-features/save-a-search.png new file mode 100644 index 00000000..5d4cc8be Binary files /dev/null and b/assets/images/key-terms-and-features/save-a-search.png differ diff --git a/assets/images/key-terms-and-features/save-icon.png b/assets/images/key-terms-and-features/save-icon.png new file mode 100644 index 00000000..b64d338e Binary files /dev/null and b/assets/images/key-terms-and-features/save-icon.png differ diff --git a/assets/images/key-terms-and-features/saved-search.png b/assets/images/key-terms-and-features/saved-search.png new file mode 100644 index 00000000..34d3eb99 Binary files /dev/null and b/assets/images/key-terms-and-features/saved-search.png differ diff --git a/assets/images/key-terms-and-features/search-box-business-domains.png b/assets/images/key-terms-and-features/search-box-business-domains.png new file mode 100644 index 00000000..eb6fedb9 Binary files /dev/null and b/assets/images/key-terms-and-features/search-box-business-domains.png differ diff --git a/assets/images/key-terms-and-features/search-box.png b/assets/images/key-terms-and-features/search-box.png new file mode 100644 index 00000000..b348556b Binary files /dev/null and b/assets/images/key-terms-and-features/search-box.png differ diff --git a/assets/images/key-terms-and-features/search-column-options.png b/assets/images/key-terms-and-features/search-column-options.png new file mode 100644 index 00000000..aa7c240d Binary files 
/dev/null and b/assets/images/key-terms-and-features/search-column-options.png differ diff --git a/assets/images/key-terms-and-features/search-export-dialog.png b/assets/images/key-terms-and-features/search-export-dialog.png new file mode 100644 index 00000000..523b2074 Binary files /dev/null and b/assets/images/key-terms-and-features/search-export-dialog.png differ diff --git a/assets/images/key-terms-and-features/search-export-golden-records.png b/assets/images/key-terms-and-features/search-export-golden-records.png new file mode 100644 index 00000000..4f889a6f Binary files /dev/null and b/assets/images/key-terms-and-features/search-export-golden-records.png differ diff --git a/assets/images/key-terms-and-features/search-exported-files-page.png b/assets/images/key-terms-and-features/search-exported-files-page.png new file mode 100644 index 00000000..3df84395 Binary files /dev/null and b/assets/images/key-terms-and-features/search-exported-files-page.png differ diff --git a/assets/images/key-terms-and-features/search-results-page-default.png b/assets/images/key-terms-and-features/search-results-page-default.png new file mode 100644 index 00000000..5ad7f6dc Binary files /dev/null and b/assets/images/key-terms-and-features/search-results-page-default.png differ diff --git a/assets/images/key-terms-and-features/search-results-page-sorting.png b/assets/images/key-terms-and-features/search-results-page-sorting.png new file mode 100644 index 00000000..e735eb53 Binary files /dev/null and b/assets/images/key-terms-and-features/search-results-page-sorting.png differ diff --git a/assets/images/key-terms-and-features/search-results-page-view.png b/assets/images/key-terms-and-features/search-results-page-view.png new file mode 100644 index 00000000..842b2442 Binary files /dev/null and b/assets/images/key-terms-and-features/search-results-page-view.png differ diff --git a/assets/images/key-terms-and-features/search-select-property-to-add-as-column.png 
b/assets/images/key-terms-and-features/search-select-property-to-add-as-column.png new file mode 100644 index 00000000..50c6b860 Binary files /dev/null and b/assets/images/key-terms-and-features/search-select-property-to-add-as-column.png differ diff --git a/assets/images/key-terms-and-features/search-select-vocab-to-add-as-column.png b/assets/images/key-terms-and-features/search-select-vocab-to-add-as-column.png new file mode 100644 index 00000000..4045096f Binary files /dev/null and b/assets/images/key-terms-and-features/search-select-vocab-to-add-as-column.png differ diff --git a/assets/images/key-terms-and-features/search-tabular-view.png b/assets/images/key-terms-and-features/search-tabular-view.png new file mode 100644 index 00000000..274ed6d8 Binary files /dev/null and b/assets/images/key-terms-and-features/search-tabular-view.png differ diff --git a/assets/images/key-terms-and-features/search-tile-view.png b/assets/images/key-terms-and-features/search-tile-view.png new file mode 100644 index 00000000..a67e12ba Binary files /dev/null and b/assets/images/key-terms-and-features/search-tile-view.png differ diff --git a/assets/images/management/access-control/access-control-policy-rule-actions-mask.png b/assets/images/management/access-control/access-control-policy-rule-actions-mask.png new file mode 100644 index 00000000..f2e7c0aa Binary files /dev/null and b/assets/images/management/access-control/access-control-policy-rule-actions-mask.png differ diff --git a/assets/images/management/access-control/access-control-policy-rule-actions-no-value.png b/assets/images/management/access-control/access-control-policy-rule-actions-no-value.png new file mode 100644 index 00000000..1b40919b Binary files /dev/null and b/assets/images/management/access-control/access-control-policy-rule-actions-no-value.png differ diff --git a/assets/images/management/access-control/access-control-policy-rule-actions-view.png 
b/assets/images/management/access-control/access-control-policy-rule-actions-view.png new file mode 100644 index 00000000..c8420c16 Binary files /dev/null and b/assets/images/management/access-control/access-control-policy-rule-actions-view.png differ diff --git a/assets/images/management/access-control/access-control-policy-rule-actions.png b/assets/images/management/access-control/access-control-policy-rule-actions.png new file mode 100644 index 00000000..a061351e Binary files /dev/null and b/assets/images/management/access-control/access-control-policy-rule-actions.png differ diff --git a/assets/images/management/access-control/access-control-policy-structure.png b/assets/images/management/access-control/access-control-policy-structure.png new file mode 100644 index 00000000..c73edd23 Binary files /dev/null and b/assets/images/management/access-control/access-control-policy-structure.png differ diff --git a/assets/images/management/access-control/activate-deactive-policy.png b/assets/images/management/access-control/activate-deactive-policy.png new file mode 100644 index 00000000..cd40342f Binary files /dev/null and b/assets/images/management/access-control/activate-deactive-policy.png differ diff --git a/assets/images/management/access-control/add-policy-rule.png b/assets/images/management/access-control/add-policy-rule.png new file mode 100644 index 00000000..caec6c5a Binary files /dev/null and b/assets/images/management/access-control/add-policy-rule.png differ diff --git a/assets/images/management/deduplication/create-dedup-project.png b/assets/images/management/deduplication/create-dedup-project.png new file mode 100644 index 00000000..7ab7eab2 Binary files /dev/null and b/assets/images/management/deduplication/create-dedup-project.png differ diff --git a/assets/images/management/entity-type/create-business-domain.png b/assets/images/management/entity-type/create-business-domain.png new file mode 100644 index 00000000..eab45b8b Binary files /dev/null and 
b/assets/images/management/entity-type/create-business-domain.png differ diff --git a/assets/images/management/rules/power-fx-formula-example.png b/assets/images/management/rules/power-fx-formula-example.png new file mode 100644 index 00000000..a046ec3c Binary files /dev/null and b/assets/images/management/rules/power-fx-formula-example.png differ diff --git a/assets/images/microsoft-integration/power-apps/sync-dataverse-table.png b/assets/images/microsoft-integration/power-apps/sync-dataverse-table.png index db4d8cf7..11c3f50f 100644 Binary files a/assets/images/microsoft-integration/power-apps/sync-dataverse-table.png and b/assets/images/microsoft-integration/power-apps/sync-dataverse-table.png differ diff --git a/assets/images/microsoft-integration/power-apps/sync-entity-types.png b/assets/images/microsoft-integration/power-apps/sync-entity-types.png index def4e3cd..8282cfcd 100644 Binary files a/assets/images/microsoft-integration/power-apps/sync-entity-types.png and b/assets/images/microsoft-integration/power-apps/sync-entity-types.png differ diff --git a/assets/images/playbooks/codes-duplicates.png b/assets/images/playbooks/codes-duplicates.png index e6436067..439fdbf9 100644 Binary files a/assets/images/playbooks/codes-duplicates.png and b/assets/images/playbooks/codes-duplicates.png differ diff --git a/assets/images/playbooks/configure-mapping.png b/assets/images/playbooks/configure-mapping.png index 54cffd4e..1c4d7164 100644 Binary files a/assets/images/playbooks/configure-mapping.png and b/assets/images/playbooks/configure-mapping.png differ diff --git a/assets/images/preparation/enricher/azure-openai-enricher-1.png b/assets/images/preparation/enricher/azure-openai-enricher-1.png new file mode 100644 index 00000000..75019cb2 Binary files /dev/null and b/assets/images/preparation/enricher/azure-openai-enricher-1.png differ diff --git a/assets/images/preparation/enricher/azure-openai-enricher-2.png 
b/assets/images/preparation/enricher/azure-openai-enricher-2.png new file mode 100644 index 00000000..1e3a931c Binary files /dev/null and b/assets/images/preparation/enricher/azure-openai-enricher-2.png differ diff --git a/assets/images/preparation/enricher/azure-openai-enricher-3.png b/assets/images/preparation/enricher/azure-openai-enricher-3.png new file mode 100644 index 00000000..05a74b99 Binary files /dev/null and b/assets/images/preparation/enricher/azure-openai-enricher-3.png differ diff --git a/assets/images/preparation/enricher/azure-openai-enricher-4.png b/assets/images/preparation/enricher/azure-openai-enricher-4.png new file mode 100644 index 00000000..ce791ab1 Binary files /dev/null and b/assets/images/preparation/enricher/azure-openai-enricher-4.png differ diff --git a/assets/images/preparation/enricher/azure-openai-enricher-5.png b/assets/images/preparation/enricher/azure-openai-enricher-5.png new file mode 100644 index 00000000..8781dc72 Binary files /dev/null and b/assets/images/preparation/enricher/azure-openai-enricher-5.png differ diff --git a/assets/images/preparation/enricher/brreg-enricher-config-1.png b/assets/images/preparation/enricher/brreg-enricher-config-1.png new file mode 100644 index 00000000..9c84e6eb Binary files /dev/null and b/assets/images/preparation/enricher/brreg-enricher-config-1.png differ diff --git a/assets/images/preparation/enricher/brreg-enricher-config-2.png b/assets/images/preparation/enricher/brreg-enricher-config-2.png new file mode 100644 index 00000000..9bb9ac8a Binary files /dev/null and b/assets/images/preparation/enricher/brreg-enricher-config-2.png differ diff --git a/assets/images/preparation/enricher/clearbit-enricher-2.png b/assets/images/preparation/enricher/clearbit-enricher-2.png index c23766df..4483fc0c 100644 Binary files a/assets/images/preparation/enricher/clearbit-enricher-2.png and b/assets/images/preparation/enricher/clearbit-enricher-2.png differ diff --git 
a/assets/images/preparation/enricher/comapnies-house-enricher-config-1.png b/assets/images/preparation/enricher/comapnies-house-enricher-config-1.png new file mode 100644 index 00000000..cd790e33 Binary files /dev/null and b/assets/images/preparation/enricher/comapnies-house-enricher-config-1.png differ diff --git a/assets/images/preparation/enricher/comapnies-house-enricher-config-2.png b/assets/images/preparation/enricher/comapnies-house-enricher-config-2.png new file mode 100644 index 00000000..80884fa9 Binary files /dev/null and b/assets/images/preparation/enricher/comapnies-house-enricher-config-2.png differ diff --git a/assets/images/preparation/enricher/cvr-enricher-config-1.png b/assets/images/preparation/enricher/cvr-enricher-config-1.png new file mode 100644 index 00000000..f096e45e Binary files /dev/null and b/assets/images/preparation/enricher/cvr-enricher-config-1.png differ diff --git a/assets/images/preparation/enricher/cvr-enricher-config-2.png b/assets/images/preparation/enricher/cvr-enricher-config-2.png new file mode 100644 index 00000000..d6a7db80 Binary files /dev/null and b/assets/images/preparation/enricher/cvr-enricher-config-2.png differ diff --git a/assets/images/preparation/enricher/duck-duck-go-enricher-2.png b/assets/images/preparation/enricher/duck-duck-go-enricher-2.png index 4792f601..7eaf5684 100644 Binary files a/assets/images/preparation/enricher/duck-duck-go-enricher-2.png and b/assets/images/preparation/enricher/duck-duck-go-enricher-2.png differ diff --git a/assets/images/preparation/enricher/gleif-enricher-5.png b/assets/images/preparation/enricher/gleif-enricher-5.png index 2e1b8e34..14e4c9b3 100644 Binary files a/assets/images/preparation/enricher/gleif-enricher-5.png and b/assets/images/preparation/enricher/gleif-enricher-5.png differ diff --git a/assets/images/preparation/enricher/google-maps-enricher-config-1.png b/assets/images/preparation/enricher/google-maps-enricher-config-1.png new file mode 100644 index 
00000000..aec5208f Binary files /dev/null and b/assets/images/preparation/enricher/google-maps-enricher-config-1.png differ diff --git a/assets/images/preparation/enricher/google-maps-enricher-config-2.png b/assets/images/preparation/enricher/google-maps-enricher-config-2.png new file mode 100644 index 00000000..da7a7e56 Binary files /dev/null and b/assets/images/preparation/enricher/google-maps-enricher-config-2.png differ diff --git a/assets/images/preparation/enricher/knowledge-graph-enricher-2.png b/assets/images/preparation/enricher/knowledge-graph-enricher-2.png index cf14cd14..c536d2fe 100644 Binary files a/assets/images/preparation/enricher/knowledge-graph-enricher-2.png and b/assets/images/preparation/enricher/knowledge-graph-enricher-2.png differ diff --git a/assets/images/preparation/enricher/libpostal-enricher-2.png b/assets/images/preparation/enricher/libpostal-enricher-2.png index 695d0b17..9300a3bd 100644 Binary files a/assets/images/preparation/enricher/libpostal-enricher-2.png and b/assets/images/preparation/enricher/libpostal-enricher-2.png differ diff --git a/assets/images/preparation/enricher/open-corporates-enricher-2.png b/assets/images/preparation/enricher/open-corporates-enricher-2.png index d1a83793..55cd1299 100644 Binary files a/assets/images/preparation/enricher/open-corporates-enricher-2.png and b/assets/images/preparation/enricher/open-corporates-enricher-2.png differ diff --git a/assets/images/preparation/enricher/permid-enricher-2.png b/assets/images/preparation/enricher/permid-enricher-2.png index e374ad0e..aa046c29 100644 Binary files a/assets/images/preparation/enricher/permid-enricher-2.png and b/assets/images/preparation/enricher/permid-enricher-2.png differ diff --git a/assets/images/preparation/enricher/vatlayer-enricher-2.png b/assets/images/preparation/enricher/vatlayer-enricher-2.png index c560a230..a5ab47d0 100644 Binary files a/assets/images/preparation/enricher/vatlayer-enricher-2.png and 
b/assets/images/preparation/enricher/vatlayer-enricher-2.png differ diff --git a/assets/images/preparation/enricher/web-enricher-2.png b/assets/images/preparation/enricher/web-enricher-2.png index afe90cbc..86870082 100644 Binary files a/assets/images/preparation/enricher/web-enricher-2.png and b/assets/images/preparation/enricher/web-enricher-2.png differ diff --git a/assets/other/training-company.csv b/assets/other/training-company.csv new file mode 100644 index 00000000..a981d737 --- /dev/null +++ b/assets/other/training-company.csv @@ -0,0 +1,11 @@ +company_id,company_name +1,Brown-Nitzsche +2,Green-Gleason +3,Runolfsson and Sons +4,"Schoen, Bashirian and Roob" +5,Schmidt-Rohan +6,Boehm-Mayert +7,Ratke-McLaughlin +8,Goldner Inc +9,"Reichert, Parisian and Torphy" +10,Monahan Group diff --git a/assets/other/training-data.csv b/assets/other/training-data.csv index 1f8a9cb6..800d18e3 100644 --- a/assets/other/training-data.csv +++ b/assets/other/training-data.csv @@ -37,24 +37,24 @@ id,first_name,last_name,email,job_title 36,Lukas,Greenwood,lgreenwoodz@ovh.net,General Manager 37,Sindee,Gotcliff,sgotcliff10@amazon.com,Database Administrator 38,Mersey,Aspin,maspin11@studiopress.com,Database Administrator -39,Alis,Baly,abaly12@salon.com,Database Administrator +39,Alis,Baly,abaly12@dropbox.com,Database Administrator 40,Ervin,Tann,etann13@shinystat.com,Acountant 41,Glad,Formilli,gformilli14@squarespace.com,Acountant 42,Guthry,De Stoop,gdestoop15@webs.com,Acountant 43,Alissa,Fearon,afearon16@vinaora.com,Acountant 44,Sibella,Preston,spreston17@reddit.com,Acountant -45,Davidde,Scamaden,dscamaden18@bloglovin.com,Acountant +45,Davidde,Scamaden,dscamaden18@dropbox.com,Acountant 46,Keefe,Purdom,kpurdom19@imgur.com,Automation Specialist 47,Ward,Leaman,wleaman1a@technorati.com,Automation Specialist 48,Eberhard,Francesc,efrancesc1b@yellowpages.com,Recruiter -49,Denni,Laye,dlaye1c@wix.com,Account Executive +49,Denni,Laye,dlaye1c@dropbox.com,Account Executive 
50,Filberto,Regi,fregi1d@fotki.com,Account Executive 51,Lamond,Acosta,lacosta1e@amazon.co.uk,VP Sales 52,Nance,Tween,ntween1f@hubpages.com,Assistant Manager -53,Haily,Lesper,hlesper1g@t-online.de,Assistant Manager +53,Haily,Lesper,hlesper1g@dropbox.com,Assistant Manager 54,Gustavo,McPeck,gmcpeck1h@europa.eu,Assistant Manager 55,Zack,Cauderlie,zcauderlie1i@umich.edu,Assistant Manager -56,Sloan,Pinfold,spinfold1j@tinypic.com,Help Desk Operator +56,Sloan,Pinfold,spinfold1j@dropbox.com,Help Desk Operator 57,Dion,Feldfisher,dfeldfisher1k@wired.com,Help Desk Operator 58,Franchot,Kelshaw,fkelshaw1l@alexa.com,Senior Quality Engineer 59,Merla,Benallack,mbenallack1m@apple.com,Senior Quality Engineer @@ -65,14 +65,14 @@ id,first_name,last_name,email,job_title 64,Jsandye,Satchell,jsatchell1r@archive.org,Senior Editor 65,Lauree,Vauls,lvauls1s@bing.com,Senior Editor 66,Troy,Raittie,traittie1t@fema.gov,Product Manager -67,Ertha,Doelle,edoelle1u@salon.com,Product Manager +67,Ertha,Doelle,edoelle1u@dropbox.com,Product Manager 68,Gracie,Vigours,gvigours1v@slate.com,Product Manager 69,Sallee,Disdel,sdisdel1w@skyrock.com,Help Desk Operator 70,Way,Leet,wleet1x@wunderground.com,Information Systems Manager 71,Sammy,Laughrey,slaughrey1y@ftc.gov,Information Systems Manager 72,Kristina,Taffs,ktaffs1z@opera.com,Help Desk Operator 73,Abe,MacGilmartin,amacgilmartin20@hud.gov,Help Desk Operator -74,Yvonne,Marder,ymarder21@photobucket.com,Help Desk Operator +74,Yvonne,Marder,ymarder21@dropbox.com,Help Desk Operator 75,Rachel,Bulcock,rbulcock22@amazonaws.com,Help Desk Operator 76,Jamey,Monelle,jmonelle23@tiny.cc,Help Desk Operator 77,Kyle,Orans,korans24@disqus.com,Accounting Assistant @@ -81,12 +81,12 @@ id,first_name,last_name,email,job_title 80,Wallache,Surman,wsurman27@washington.edu,Accounting Assistant 81,Josiah,Legat,jlegat28@economist.com,Accounting Assistant 82,Karla,Spykins,kspykins29@newsvine.com,Software Developer -83,Georas,Nehls,gnehls2a@example.com,Software Developer 
+83,Georas,Nehls,gnehls2a@dropbox.com,Software Developer 84,Hewie,Tremmil,htremmil2b@wordpress.com,Software Developer 85,Cristobal,Broggini,cbroggini2c@tripadvisor.com,Project Manager 86,Leilah,Parnaby,lparnaby2d@hostgator.com,Project Manager 87,Mureil,Groger,mgroger2e@myspace.com,Legal Assistant -88,Irwinn,Meehan,imeehan2f@cbslocal.com,Legal Assistant +88,Irwinn,Meehan,imeehan2f@dropbox.com,Legal Assistant 89,Rafferty,Goodings,rgoodings2g@scribd.com,Legal Assistant 90,Mathias,Matschoss,mmatschoss2h@howstuffworks.com,Quality Engineer 91,Anna-maria,Petrakov,apetrakov2i@foxnews.com,Account Representative @@ -96,7 +96,7 @@ id,first_name,last_name,email,job_title 95,Farand,Elfitt,felfitt2m@hp.com,Accountant 96,Earvin,Tash,etash2n@blogs.com,Accountant 97,Jenilee,Fishly,jfishly2o@cafepress.com,Research Assistant -98,Beckie,Martinson,bmartinson2p@symantec.com,Research Assistant +98,Beckie,Martinson,bmartinson2p@dropbox.com,Research Assistant 99,Lilla,Kingsworth,lkingsworth2q@google.cn,Research Assistant 100,Kyla,Ferreri,kferreri2r@opera.com,Marketing Assistant 101,Eddy,Shuard,eshuard2s@surveymonkey.com,Marketing Assistant @@ -104,17 +104,17 @@ id,first_name,last_name,email,job_title 103,Marshal,Calverley,mcalverley2u@livejournal.com,Marketing Assistant 104,Jacquenetta,Sparshott,jsparshott2v@moonfruit.com,Marketing Assistant 105,Wesley,Volett,wvolett2w@ask.com,Software Test Engineer -106,Hesther,Hamflett,hhamflett2x@usa.gov,Software Test Engineer +106,Hesther,Hamflett,hhamflett2x@dropbox.com,Software Test Engineer 107,Halli,Predohl,hpredohl2y@t.co,Software Test Engineer 108,Dominique,Wikey,dwikey2z@dedecms.com,Software Test Engineer 109,Basil,Ganning,bganning30@tumblr.com,Software Test Engineer 110,Berny,Duke,bduke31@dion.ne.jp,Technical Writer -111,Vinny,Sprowles,vsprowles32@csmonitor.com,Technical Writer +111,Vinny,Sprowles,vsprowles32@dropbox.com,Technical Writer 112,Aldrich,Jendricke,ajendricke33@independent.co.uk,Technical Writer 
113,Odessa,Horsley,ohorsley34@blogspot.com,Technical Writer 114,Tine,Guillond,tguillond35@auda.org.au,Technical Writer 115,Mahala,Hamshar,mhamshar36@oracle.com,Computer Systems Analyst -116,Matilde,Lemme,mlemme37@shinystat.com,Computer Systems Analyst +116,Matilde,Lemme,mlemme37@dropbox.com,Computer Systems Analyst 117,Kimberley,Tiffney,ktiffney38@unblog.fr,Computer Systems Analyst 118,Evie,Mostin,emostin39@walmart.com,Computer Systems Analyst 119,Kristel,Warrell,kwarrell3a@google.pl,Computer Systems Analyst @@ -123,8 +123,8 @@ id,first_name,last_name,email,job_title 122,Esra,Brevetor,ebrevetor3d@bloomberg.com,Senior Quality Engineer 123,Korney,Stych,kstych3e@sina.com.cn,Senior Quality Engineer 124,Bea,Dottridge,bdottridge3f@barnesandnoble.com,Senior Quality Engineer -125,Paton,Duggan,pduggan3g@example.com,Senior Quality Engineer -126,Anthe,O'Cooney,aocooney3h@desdev.cn,Senior Quality Engineer +125,Paton,Duggan,pduggan3g@dropbox.com,Senior Quality Engineer +126,Anthe,O'Cooney,aocooney3h@dropbox.com,Senior Quality Engineer 127,Gianina,Farrey,gfarrey3i@diigo.com,Senior Quality Engineer 128,Erina,Borton,eborton3j@hc360.com,Sales Representative 129,Yvon,Cutforth,ycutforth3k@dyndns.org,Sales Representative diff --git a/assets/other/training-employee.csv b/assets/other/training-employee.csv new file mode 100644 index 00000000..c5a01ed4 --- /dev/null +++ b/assets/other/training-employee.csv @@ -0,0 +1,21 @@ +employee_id,first_name,last_name,company_id +1,Binni,Lindblom,2 +2,Aldric,Green,2 +3,Thaxter,Geale,2 +4,Dyanne,Scotchmore,2 +5,Gustavo,Fox,5 +6,Iago,Latour,5 +7,Hynda,Bertholin,7 +8,Gayelord,Mapes,7 +9,Miller,Bunner,7 +10,Coralyn,Durbyn,7 +11,Lonnie,Kield,8 +12,Sheree,Daines,8 +13,Nan,Eastby,8 +14,Benito,Yurocjhin,10 +15,Klarrisa,Ianne,10 +16,Marylynne,Docwra,10 +17,Armin,Mallabar,10 +18,Alexia,Athow,10 +19,Ava,Gullen,10 +20,Lyssa,Darlasson,10 diff --git a/docs/010-getting-started/020-data-ingestion.md b/docs/010-getting-started/020-data-ingestion.md index 
bcd107b3..95350802 100644 --- a/docs/010-getting-started/020-data-ingestion.md +++ b/docs/010-getting-started/020-data-ingestion.md @@ -17,7 +17,7 @@ Ingesting data to CluedIn involves three basic steps: importing, mapping, and pr -In this guide, you will learn how to import a file into CluedIn, create a mapping, process the data, and perform data searches. +In this guide, you will learn how to import a file into CluedIn, create a mapping, process the data, and search for golden records. **File for practice:** training-data.csv @@ -29,9 +29,9 @@ A CSV (comma-separated values) file format allows data to be saved in a tabular **To import a file** -1. On the home page, in the **Integrations** section, select **Import From Files**. +1. On the navigation pane, go to **Ingestion**, and then in the **Files** section, select **Add**. - ![import-a-file-1.png](../../assets/images/getting-started/data-ingestion/import-a-file-1.png) + ![files-add.png](../../assets/images/getting-started/data-ingestion/files-add.png) 1. In the **Add Files** section, add the file. You may drag the file or select the file from the computer. @@ -49,9 +49,7 @@ After you uploaded the file, you can view the data from the file as a table with **To view imported data** -1. On the home page, in the **Integrations** section, select **View All Data Sources**. - - Alternatively, on the navigation pane, go to **Integrations** > **Data Sources**. +1. On the navigation pane, go to **Ingestion** > **Sources**. 1. Find and expand the group that you created in the previous procedure. @@ -89,11 +87,11 @@ Mapping is the process of creating a semantic layer for your data so that CluedI 1. Select **Next**. -1. In **Entity Type**, enter the name of a new entity type and select **Create**. An entity type is a specific business object within the organization. A well-named entity type is global and should not be changed (for example, Person, Organization, Car) across sources. +1. 
In **Business Domain**, enter the name of a new business domain and select **Create**. A business domain is a specific business object within the organization (for example, Person, Organization, Car). A well-named business domain is global and should not be changed across sources. - The **Entity Type Code** is created automatically; it is a string that represents the entity type in code (for example, in clues). + The **Business Domain Identifier** is created automatically; it is a string that represents the business domain in code (for example, in clues). -1. In **Icon**, select the visual representation of the entity type. +1. In **Icon**, select the visual representation of the business domain. ![create-mapping-3.png](../../assets/images/getting-started/data-ingestion/create-mapping-3.png) @@ -103,7 +101,7 @@ Mapping is the process of creating a semantic layer for your data so that CluedI ![create-mapping-4.png](../../assets/images/getting-started/data-ingestion/create-mapping-4.png) -1. In **Origin**, make sure that the selected field is appropriate for generating a unique identifier (Entity Origin Code) for each record. +1. In **Primary Identifier**, review the field that was selected automatically for generating a unique identifier for each record. ![create-mapping-5.png](../../assets/images/getting-started/data-ingestion/create-mapping-5.png) @@ -131,7 +129,7 @@ Processing turns your data into golden records that can be cleaned, deduplicated 1. Select **Process**. -1. Review information about records that will be processed. Pay attention to the **Origin Entity Code Status** and **Code Status** sections. If there are duplicates, they will be merged during processing. +1. Review information about records that will be processed. Pay attention to the **Primary Identifier Status** section. If there are duplicates, they will be merged during processing.
![process-data-2.png](../../assets/images/getting-started/data-ingestion/process-data-2.png) diff --git a/docs/010-getting-started/030-manual-data-cleaning.md b/docs/010-getting-started/030-manual-data-cleaning.md index b63d656a..f0439923 100644 --- a/docs/010-getting-started/030-manual-data-cleaning.md +++ b/docs/010-getting-started/030-manual-data-cleaning.md @@ -35,11 +35,11 @@ Finding the data that needs to be cleaned involves defining search filters and s 1. In the search field, select the search icon. Then, select **Filter**. -1. In the **Entity Types** dropdown list, select the entity type to filter the records. +1. In the **Business Domains** dropdown list, select the business domain to filter the records. ![find-data-1.png](../../assets/images/getting-started/data-cleaning/find-data-1.png) - As a result, all records with the selected entity type are displayed on the page. By default, the search results are shown in the following columns: **Name**, **Entity Type**, and **Description**. + As a result, all records with the selected business domain are displayed on the page. By default, the search results are shown in the following columns: **Name**, **Business Domain**, and **Description**. 1. To find the specific values that you want to fix, add the corresponding column to the list of search results: @@ -47,7 +47,9 @@ Finding the data that needs to be cleaned involves defining search filters and s 1. Select **Add columns** > **Vocabulary**. - 1. In the search field, enter the name of the vocabulary and start the search. In the search results, select the needed vocabulary key. + 1. In the **Vocabulary Keys** section, expand the vocabulary that contains the needed vocabulary key, and then select the checkbox next to it. + + 1. Move the vocabulary key to the **Selected Vocabulary Keys** section using the arrow pointing to the right. 
![find-data-2.png](../../assets/images/getting-started/data-cleaning/find-data-2.png) @@ -71,7 +73,7 @@ After you have found the data that needs to be cleaned, create a clean project. **To create a clean project** -1. In the upper-right corner of the search results page, select the ellipsis button, and then select **Clean**. +1. In the upper-right corner of the search results page, open the three-dot menu, and then select **Clean**. ![create-a-clean-project-1.png](../../assets/images/getting-started/data-cleaning/create-a-clean-project-1.png) @@ -101,7 +103,7 @@ After you have found the data that needs to be cleaned, create a clean project. ![modify-data-2.png](../../assets/images/getting-started/data-cleaning/modify-data-2.png) -1. In the upper-right corner, select **Process**. In the confirmation dialog box, select **Skip stale data** and clear the **Enable rules auto generation** checkbox. Then, confirm that you want to process the data. +1. In the upper-right corner, select **Process**. In the confirmation dialog box, select **Skip** and leave the **Enable automatic generation of data part rules** checkbox cleared. Then, confirm that you want to process the data. ![modify-data-3.png](../../assets/images/getting-started/data-cleaning/modify-data-3.png) diff --git a/docs/010-getting-started/040-deduplication.md b/docs/010-getting-started/040-deduplication.md index bb0da830..ac30f50f 100644 --- a/docs/010-getting-started/040-deduplication.md +++ b/docs/010-getting-started/040-deduplication.md @@ -25,35 +25,29 @@ In this guide, you will learn how to deduplicate the data that you have ingested ## Create deduplication project -As a first step, you need to create a deduplication project that allows you to check for duplicates that belong to a certain entity type. +As a first step, you need to create a deduplication project that allows you to check for duplicates that belong to a certain business domain. **To create a deduplication project** 1. 
On the navigation pane, go to **Management**. Then, select **Deduplication**. - ![create-dedup-project-1.png](../../assets/images/getting-started/deduplication/create-dedup-project-1.png) - 1. Select **Create Deduplication Project**. 1. On the **Create Deduplication Project** pane, do the following: 1. Enter the name of the deduplication project. - 1. Select the entity type that you want to use as a filter for all records. + 1. Select the business domain that you want to use as a filter for all records. ![dedup-2.png](../../assets/images/getting-started/deduplication/dedup-2.png) 1. In the lower-right corner, select **Create**. - You created the deduplication project. - - ![dedup-3.png](../../assets/images/getting-started/deduplication/dedup-3.png) - - Now, you can proceed to define the rules for checking duplicates within the selected entity type. + You created the deduplication project. Now, you can proceed to define the rules for checking duplicates within the selected business domain. ## Configure matching rule -When creating a matching rule, you need to specify certain criteria. CluedIn uses these criteria to check for matching values among records belonging to the selected entity type. +When creating a matching rule, you need to specify certain criteria. CluedIn uses these criteria to check for matching values among records belonging to the selected business domain. **To configure a matching rule** @@ -87,8 +81,6 @@ When creating a matching rule, you need to specify certain criteria. CluedIn use The status of the deduplication project becomes **Ready to generate**. - ![dedup-7.png](../../assets/images/getting-started/deduplication/dedup-7.png) - 1. In the upper-right corner, select **Generate Results**. Then, confirm that you want to generate the results for the deduplication project. {:.important} @@ -130,19 +122,14 @@ The process of fixing duplicates involves reviewing the values from duplicate re 1. Review the group that will be merged and select **Next**. 
- 1. Select an option to handle the data merging process if more recent data becomes available for the entity. Then, select **Confirm**. + 1. Select an option to handle the data merging process if more recent data becomes available for the golden record. Then, select **Confirm**. ![dedup-12.png](../../assets/images/getting-started/deduplication/dedup-12.png) - {:.important} - The process of merging data may take some time. - - After the process is completed, you will receive a notification. As a result, the duplicate records have been merged into one record. - - You fixed the duplicate records. + The process of merging data may take some time. After the process is completed, you will receive a notification. As a result, the duplicate records have been merged into one record. {:.important} -All changes to the data records in CluedIn are tracked. You can search for the needed data record and on the **Topology** pane, you can view the visual representation of the records that were merged through the deduplication process. +All changes to golden records in CluedIn are tracked. You can search for the needed golden record and on the **Topology** pane, you can view the visual representation of the records that were merged through the deduplication process. ## Results & next steps diff --git a/docs/010-getting-started/050-data-streaming.md b/docs/010-getting-started/050-data-streaming.md index ae23dc0c..b4b8f874 100644 --- a/docs/010-getting-started/050-data-streaming.md +++ b/docs/010-getting-started/050-data-streaming.md @@ -35,7 +35,7 @@ An export target is a place where you can send the data out of CluedIn after it ![add-export-target-1.png](../../assets/images/getting-started/data-streaming/add-export-target-1.png) -1. On the **Configure** tab, enter the database connection details such as **Host**, **Database Name**, **Username**, and **Password**. Optionally, you may add **Port Number** and **Schema**. +1. 
On the **Configure** tab, enter the database connection details such as **Name**, **Host**, **Database Name**, **Username**, and **Password**. Optionally, you may add **Port Number**, **Schema**, and **Connection pool size**. ![add-export-target-2.png](../../assets/images/getting-started/data-streaming/add-export-target-2.png) @@ -71,21 +71,23 @@ A stream is a trigger that starts the process of sending the data to the export 1. Go to the **Export Target Configuration** tab. -1. On the **Choose connector** tab, select **Sql Server Connector**, and then select **Next**. +1. On the **Choose Connector** tab, select the Sql Server connector, and then select **Next**. ![create-stream-4.png](../../assets/images/getting-started/data-streaming/create-stream-4.png) -1. On the **Properties to export** tab, enter the **Target Name**. The target name that you enter will be the name of the table in the database. +1. On the **Connector Properties** tab, do the following: -1. Select the **Streaming Mode**. Two streaming modes are available: + 1. Enter the **Target Name**. This will be the name of the table in the database. + + 1. Select the **Streaming Mode**. Two streaming modes are available: - - **Synchronized stream** – the database and CluedIn contain the same data that is synchronized. For example, if you edit the record in CluedIn, the record is also edited in the database. + - **Synchronized stream** – the database and CluedIn contain the same data that is synchronized. For example, if you edit the record in CluedIn, the record is also edited in the database. - - **Event log stream** – every time you make a change in CluedIn, a new record is added to the database instead of replacing the existing record. + - **Event log stream** – every time you make a change in CluedIn, a new record is added to the database instead of replacing the existing record. ![create-stream-5.png](../../assets/images/getting-started/data-streaming/create-stream-5.png) -1. 
In the **Properties to export** section, click **Auto-select**. All vocabulary keys associated with the records in the strem filter will be displayed on the page. +1. On the **Properties to Export** tab, click **Auto-select**. All vocabulary keys associated with the records in the stream filter will be displayed on the page. ![create-stream-6.png](../../assets/images/getting-started/data-streaming/create-stream-6.png) diff --git a/docs/010-getting-started/060-rule-builder.md b/docs/010-getting-started/060-rule-builder.md index 8d07b647..1dd06079 100644 --- a/docs/010-getting-started/060-rule-builder.md +++ b/docs/010-getting-started/060-rule-builder.md @@ -19,7 +19,9 @@ Rule Builder allows you to create rules for cleaning, transforming, normalizing, In this article, you will learn how to create rules in CluedIn using the Rule Builder tool. You can create a rule either before or after processing the data. -# Create rule +**Before you start:** Make sure you have completed all steps in the [Ingest data guide](/getting-started/data-ingestion). + +## Create rule Creating a rule involves configuring a filter and defining the rule action. @@ -47,7 +49,6 @@ Creating a rule involves configuring a filter and defining the rule action. ![rule-builder-2.png](../../assets/images/getting-started/rule-builder/rule-builder-2.png) - {:.important} The fields for configuring a filter appear one by one. After you complete the previous field, the next field appears. For more information, see [Filters](/key-terms-and-features/filters). 1. In the **Actions** section, select **Add Action**, and then configure the action that CluedIn can perform on the filtered items: @@ -78,7 +79,7 @@ Creating a rule involves configuring a filter and defining the rule action. - If the rule applies to the unprocessed data, process the data as described in the [Ingest data guide](/getting-started/data-ingestion). In this case, the rule will be applied to the records automatically during processing.
-# Reprocess records +## Reprocess records After you created the rule for the processed data, you need to reprocess the records to apply the rule. You can reprocess the records in one of the following ways: @@ -102,7 +103,7 @@ After you created the rule for the processed data, you need to reprocess the rec 1. On the navigation pane, go to **Consume** > **GraphQL**. -1. Enter a query to reprocess all records that belong to a certain entity type. Replace _TrainingContact_ with the needed name of entity type. +1. Enter a query to reprocess all records that belong to a certain business domain. Replace _TrainingContact_ with the name of the needed business domain. ``` { search(query: "entityType:/TrainingContact") { @@ -116,7 +117,7 @@ After you created the rule for the processed data, you need to reprocess the rec ``` 1. Execute the query. - You reprocessed all records that belong to a certain entity type. Now, the action from the rule is applied to those records. + You reprocessed all records that belong to a certain business domain. Now, the action from the rule is applied to those records. **To reprocess a record manually** @@ -132,11 +133,11 @@ After you created the rule for the processed data, you need to reprocess the rec 1. To reprocess other records, repeat steps 1–2. -# Change rule +## Modify rule After you created the rule, you can [edit](#edit-rule), [inactivate](#inactivate-rule), or [delete](#delete-rule) it. -## Edit rule +### Edit rule If you want to change the rule—name, description, filters, or actions—edit the rule. @@ -152,11 +153,11 @@ If you want to change the rule—name, description, filters, or actions—edit t ![manage-rules-1.png](../../assets/images/management/rules/manage-rules-1.png) - For example, in the previous configuration, the rule added the tag _Prospect_ to all records of the _TrainingContact_ entity type.
If you edit the rule filter and change the entity type to _Contact_, then selecting the checkbox will remove the tag from the records of the _TrainingContact_ entity type and add it to the records of the _Contact_ entity type. + For example, in the previous configuration, the rule added the tag _Prospect_ to all records of the _TrainingContact_ business domain. If you edit the rule filter and change the business domain to _Contact_, then selecting the checkbox will remove the tag from the records of the _TrainingContact_ business domain and add it to the records of the _Contact_ business domain. - If you don't want to reprocess the records affected both by the previous and current rule configuration, leave the checkbox unselected, and then confirm your choice. You can reprocess such records later. However, note that reprocessing via the rule details page applies only to the records matching the current rule configuration. To revert rule actions on records matching the previous rule configuration, you'll need to reprocess such records via GraphQL or manually. -## Inactivate rule +### Inactivate rule If you currently do not need the rule, but might need it in future, inactivate the rule. @@ -172,7 +173,7 @@ If you currently do not need the rule, but might need it in future, inactivate t 1. To return the records to which the rule was applied to their original state, [reprocess the records](#reprocess-records). -## Delete rule +### Delete rule If you no longer need the rule, delete it. @@ -186,10 +187,8 @@ If you no longer need the rule, delete it. 1. To return the records to which the rule was applied to their original state, [reprocess the records](#reprocess-records). -# Results - -You learned the basic steps for creating rules to manage your records in CluedIn. You also learned how to apply the actions of the rule to the records associated with the rule. 
+## Results & next steps -# Next steps +After completing all steps outlined in this guide, you learned how to create rules to manage your records in CluedIn and how to apply the actions of the rule to the records associated with the rule. -- [Create hierarchies](/getting-started/hierarchy-builder) +Next, learn how to visualize relations between golden records with the help of Hierarchy Builder in our [Create hierarchies guide](/getting-started/hierarchy-builder). diff --git a/docs/010-getting-started/070-hierarchy-builder.md b/docs/010-getting-started/070-hierarchy-builder.md index 381dbbab..43497b71 100644 --- a/docs/010-getting-started/070-hierarchy-builder.md +++ b/docs/010-getting-started/070-hierarchy-builder.md @@ -19,41 +19,9 @@ Hierarchy Builder allows you to visualize relations between golden records. For In this article, you will learn how to create hierarchies in CluedIn using the Hierarchy Builder tool. -**Prerequisites** +**Before you start:** Make sure you have completed all steps in the [Ingest data guide](/getting-started/data-ingestion) and [Stream data guide](/getting-started/data-streaming). -Before proceeding with hierarchies, ensure that you have completed the following tasks: - -1. Ingested some data into CluedIn. For more information, see [Ingest data](/getting-started/data-ingestion). - -1. Created a stream that keeps the data synchronized between CluedIn and the Microsoft SQL Server database. For more information, see [Stream data](/getting-started/data-streaming). - -# Enable Manual Hierarchies - -To be able to create hierarchies in CluedIn, enable the **Manual Hierarchies** feature. - -**To enable the Manual Hierarchies feature** - -1. On the navigation pane, go to **Administration** > **Feature Flags**. - -1. Find the **Manual Hierarchies** feature. - - ![enable-feature-1.png](../../assets/images/getting-started/hierarchy-builder/enable-feature-1.png) - -1. Turn on the toggle next to the feature status. 
Then, confirm that you want to enable the feature. - - The status of the **Manual Hierarchies** feature is changed to **Enabled**. - - ![enable-feature-2.png](../../assets/images/getting-started/hierarchy-builder/enable-feature-2.png) - - Now, you can build hierarchies. - -After enabling the **Manual Hierarchies** feature, the **Hierarchy** tab is added to the golden record page. If you disable the feature, then the tab will no longer be displayed. However, if you enable the feature again, the tab and all previously created hierarchies will be displayed again. - -# Build hierarchy - -After you enabled the **Manual Hierarchies** feature, you can build hierarchies. - -**To build a hierarchy** +## Build a hierarchy 1. On the navigation pane, go to **Management** > **Hierarchy Builder**. @@ -63,14 +31,20 @@ After you enabled the **Manual Hierarchies** feature, you can build hierarchies. 1. Enter the name of the hierarchy. - 1. If you want to limit the records for building the hierarchy, find and select the entity type. - - All records belonging to the selected entity type will be available to build the hierarchy. If you do not select the entity type, then all records existing in the system will be available to build the hierarchy. + 1. If you want to limit the records for building the hierarchy, find and select the business domain. - 1. In the lower-right corner, select **Create**. + All records belonging to the selected business domain will be available to build the hierarchy. If you do not select the business domain, then all records existing in the system will be available to build the hierarchy. ![create-hierarchy-1.png](../../assets/images/getting-started/hierarchy-builder/create-hierarchy-1.png) + 1. In the lower-right corner, select **Next**. + + 1. Select the starting point for the hierarchy project: **Blank** (if you do not have existing relations between golden records) or **From existing relations** (if you have existing relations between golden records). 
+ + ![create-hierarchy-2.png](../../assets/images/getting-started/hierarchy-builder/create-hierarchy-2.png) + + 1. Select **Create**. + The hierarchy builder page opens. 1. Build the visual hierarchy by dragging the records from the left pane to the canvas. @@ -85,13 +59,13 @@ After you enabled the **Manual Hierarchies** feature, you can build hierarchies. You created the hierarchy. -You can view the hierarchy on the **Hierarchy Builder** page or on the **Hierarchies** tab of the golden record page. In addition, you can view the relations between the records on the **Relation** tab of the golden record page. +You can view the hierarchy on the **Hierarchy Builder** page or on the **Hierarchies** tab of the golden record page. In addition, you can view the relations between the records on the **Relations** tab of the golden record page. ![relations.png](../../assets/images/getting-started/hierarchy-builder/relations.png) To make sure that the data in the Microsoft SQL Server database reflects the relations that you set up, [update the stream configuration](#update-stream-configuration). -# Manage hierarchy +## Manage a hierarchy After you created the hierarchy, you can do the following actions with the elements of the hierarchy: @@ -101,11 +75,11 @@ After you created the hierarchy, you can do the following actions with the eleme - Collapse and expand elements. To collapse all elements below a certain element, point to the needed element and select **Collapse**. To expand the collapsed elements, point to the parent element, and select **Expand**. - You can also collapse all elements under a parent element. To do that, in the lower-right corner, select **Collapse all** (![collapse.png](../../assets/images/getting-started/hierarchy-builder/collapse.png)). + You can also collapse all elements under a parent element. To do that, in the lower-right corner, select **Collapse all**. -- View the data associated with the elements. To do that, select the element. 
+- View the data associated with the elements. To do that, point to the element, select the three-dot menu, and then select **Open Entity**. -# Update stream configuration +## Update stream configuration After you published the hierarchy, update the stream to ensure that the data in the database reflects the relations between the records that you set up. @@ -115,11 +89,11 @@ After you published the hierarchy, update the stream to ensure that the data in 1. Open the needed stream. -1. Go to the **Export Target Configuration** page. Then, select **Edit Target Configuration** and confirm that you want to edit the stream. +1. Go to the **Export Target Configuration** pane. Then, select **Edit Export Configuration** and confirm that you want to edit the stream. -1. On the **Choose a connector** tab, select **Next**. +1. Go to the **Properties to export** tab. To do this, select **Next** two times. -1. On the **Properties to export** tab, find the **Export Edges** section. Then, turn on the **Outgoing** and **Incoming** toggles. +1. In the **Export Edges** section, turn on the **Outgoing** and **Incoming** toggles. ![update-stream-1.png](../../assets/images/getting-started/hierarchy-builder/update-stream-1.png) @@ -131,11 +105,8 @@ After you published the hierarchy, update the stream to ensure that the data in If you update the hierarchy, the relations between records will be automatically updated in the database. -# Results - -You have created a hierarchy in CluedIn. - -# Next steps +## Results & next steps -- [Create glossary](/getting-started/glossary) +After completing all steps outlined in this guide, you learned how to visualize relations between golden records with the help of Hierarchy Builder and how to send these relations to a Microsoft SQL Server database. If you make any changes to the relations in CluedIn, they will be automatically updated in the database. 
+Next, learn how to use a glossary to document groups of golden records that meet specific criteria in the [Work with glossary guide](/getting-started/glossary). \ No newline at end of file diff --git a/docs/010-getting-started/080-glossary.md index b462608b..c7fdcccd 100644 --- a/docs/010-getting-started/080-glossary.md +++ b/docs/010-getting-started/080-glossary.md @@ -11,25 +11,19 @@ tags: ["getting-started"] 1. TOC {:toc} -Glossary can help you in documenting groups of records that meet specific criteria, simplifying the process of cleaning and streaming these groups of records. +Glossary can help you document groups of golden records that meet specific criteria, simplifying the process of cleaning and streaming these groups of records.
-In this article, you will learn how to work with the glossary in CluedIn. Glossary consists of terms that are grouped into categories. Each term contains a list of records that correspond to your conditions. +In this article, you will learn how to work with the glossary in CluedIn. Glossary consists of terms that are grouped into categories. Each term contains a list of golden records that correspond to your conditions. Working with the glossary in CluedIn includes creating categories and creating terms within those categories. You can have multiple terms under one category. -**Prerequisites** +**Before you start:** Make sure you have completed all steps in the [Ingest data guide](/getting-started/data-ingestion) and [Stream data guide](/getting-started/data-streaming). -Before proceeding with the glossary, ensure that you have completed the following tasks: - -1. Ingested some data into CluedIn. For more information, see [Ingest data](/getting-started/data-ingestion). - -1. Created a stream that keeps the data synchronized between CluedIn and the Microsoft SQL Server database. For more information, see [Stream data](/getting-started/data-streaming). - -# Create category +## Create category Category refers to a logical grouping of terms. For example, you can have a category named **Customer**, which would contain customer records organized by regions. Each region is a separate term within the category. In other words, a category acts as a folder for terms. @@ -41,7 +35,7 @@ Category refers to a logical grouping of terms. For example, you can have a cate - If you haven't created any categories before, then in the middle of the page, select **Create Category**. - - If you already created some categories, then, in the upper-left corner, select **Create** > **Create Category**. + - If you already created some categories, then, in the upper-left corner of the page, select **Create** > **Create Category**. 1. Enter the category name. 
Then, in the lower-right corner, select **Create**. @@ -49,9 +43,9 @@ Category refers to a logical grouping of terms. For example, you can have a cate You created the category. Now, you can create a term in the category. -# Create term +## Create term -Term refers to the list of records that meet specific conditions. For example, within the **Customer** category, you can have a term named **North America** encompassing customer records where the **BusinessRegion** vocabulary key is set to **North America**. +Term refers to the list of golden records that meet specific conditions. For example, within the **Customer** category, you can have a term named **North America** encompassing customer records where the **BusinessRegion** vocabulary key is set to **North America**. **To create a term** @@ -63,37 +57,20 @@ Term refers to the list of records that meet specific conditions. For example, w 1. In the **Category** section, leave the **Choose an existing Category** checkbox selected. - 1. Select the category that you created. - - ![create-term-1.png](../../assets/images/getting-started/glossary/create-term-1.png) - - 1. In the lower-left corner, select **Create**. - - You created the term. Now, proceed to define the records that will be included in this term. - -1. In the upper-right corner of the term details page, select **Edit**. + 1. Select the category that you created before. -1. On the **Configuration** tab, in the **Conditions** section, select **Add first rule**, and then specify which records will be included in the term: + 1. In the **Filters** section, define which golden records should be included in the term. - 1. Select the type of property (**Property** or **Vocabulary**). - - 1. Depending on the type of property that you selected, do one of the following: - - For **Property**, find and select the needed property. - - For **Vocabulary**, find and select the needed vocabulary key. - - 1. Select the operation. - - 1. Select the value of the property. 
+ ![create-term-1.png](../../assets/images/getting-started/glossary/create-term-1.png) - ![create-term-2.png](../../assets/images/getting-started/glossary/create-term-2.png) + 1. In the lower-right corner, select **Create**. - {:.important} - The fields for configuring a filter appear one by one. After you complete the previous field, the next field appears. + You created the term. 1. (Optional) Specify additional details about the term: + 1. Select **Edit**. + 1. In the **Certification Level** dropdown list, select the quality level of the records. 1. Enter the **Short Description** and **Long Description** of the term. @@ -104,7 +81,7 @@ Term refers to the list of records that meet specific conditions. For example, w 1. Go to the **Matches** tab to view the records that meet the condition that you set up. - By default, all records are displayed in the following columns: **Name**, **Entity Type**, and **Description**. To add more columns to the table, see step 3 of [Find data](/getting-started/manual-data-cleaning#find-data). + By default, all records are displayed in the following columns: **Name**, **Business Domain**, and **Description**. To add more columns to the table, see step 3 of [Find data](/getting-started/manual-data-cleaning#find-data). 1. Activate the term by turning on the toggle next to the term status. @@ -114,7 +91,7 @@ Term refers to the list of records that meet specific conditions. For example, w Now, you can clean the glossary term and stream it to a Microsoft SQL Server database. As an example, [update the configuration of the stream](#update-stream-configuration) that you created in the [Stream data](/getting-started/data-streaming) guide. 
-# Manage glossary +## Manage glossary You can do the following actions with the terms in the glossary: @@ -126,7 +103,7 @@ You can do the following actions with the terms in the glossary: ![manage-glossary-2.png](../../assets/images/getting-started/glossary/manage-glossary-2.png) -# Update stream configuration +## Update stream configuration Streaming the glossary terms to the database is more convenient than streaming specific records based on filters. You only need to specify the name of the glossary term, rather than setting filters for properties or vocabulary values. @@ -136,7 +113,7 @@ Streaming the glossary terms to the database is more convenient than streaming s 1. Open the needed stream. -1. On the **Configuration** page, in the **Filters** section, delete the existing filter. +1. On the **Configuration** pane, in the **Filters** section, delete the existing filter. 1. Select **Add First Filter**, and then specify the glossary term that you created: @@ -150,14 +127,12 @@ Streaming the glossary terms to the database is more convenient than streaming s 1. In the upper-right corner select **Save**. Then, confirm that you want to save your changes. - The stream is updated with the new filter. As a result, the database table now contains the records from the glossary term. + The stream is updated with the new filter. As a result, the database table now contains the records from the glossary term. If you update the glossary term, the records will be automatically updated in the database. -# Results - -You created a glossary term in CluedIn. +## Results & next steps -# Next steps +After completing all steps outlined in this guide, you learned how to create a glossary term to document a group of golden records and how to send those golden records from the glossary term to a Microsoft SQL Server database. If you make any changes to the glossary term in CluedIn, the associated records will be automatically updated in the database. 
-- [Add relations between golden records](/getting-started/relations) \ No newline at end of file +Next, learn how to add edges to build relations between golden records in the [Add relations between records](/getting-started/relations) guide. \ No newline at end of file diff --git a/docs/010-getting-started/090-relations.md b/docs/010-getting-started/090-relations.md index 2890ae0b..cc5c168e 100644 --- a/docs/010-getting-started/090-relations.md +++ b/docs/010-getting-started/090-relations.md @@ -17,47 +17,33 @@ Relations are built between source ("to") and target ("from") records by using e -In this article, you will learn how to add and view relations between golden records. +In this article, you will learn how to add and view relations between golden records. Creating relations between golden records consists of editing the mapping of the data set that you will use as a source in order to add the edge that will link the source to the target. -Creating relations between golden records consists of editing the mapping of the data set that you will use as a source and adding the edge that will link the source to the target. +**Files for practice** + +- File 1: training-data.csv + +- File 2: training-employee.csv **Prerequisites** Before proceeding with relations between golden records, ensure that you have completed the following tasks: -1. Ingested (imported, mapped, and processed) the data to which you will be linking the records. For example, this could be the list of companies with the CompanyID and CompanyName columns. For more information, see [Ingest data](/getting-started/data-ingestion). +1. Ingested (uploaded, mapped, and processed) the data to which you will be linking the records. You can use file 1 above. See the [Ingest data](/getting-started/data-ingestion) guide. -1. Imported and mapped the data that you will be linking to already existing records. For example, this could be the list of employees working for companies from the file in prerequisite 1.
The file with employees may contain the following columns: EmployeeName and CompanyID. For more information, see [Import file](/getting-started/data-ingestion#import-file) and [Create mapping](/getting-started/data-ingestion#create-mapping). +1. Uploaded and mapped the data that you will be linking to already existing records. You can use file 2 above. See [Import file](/getting-started/data-ingestion#import-file) and [Create mapping](/getting-started/data-ingestion#create-mapping). -# Edit mapping +## Add edge relations After you imported and mapped the data that you will be linking to already existing records, edit the mapping configuration. -**To edit the mapping** - -1. On the navigation pane, go to **Integration** > **Data Sources**. - -1. Find and select the needed data set. - -1. On the toolbar, select **Map**. Then, select **Edit Mapping**. - -1. Go to the **Map entity** tab. Then, expand the **Entity Codes** section. - -1. Delete the entity code by which you will link the current entity to other entities: - - 1. Select the checkbox next to the entity code. - - 1. Select **Delete Entity Code**. - - ![entity-mapping-1.png](../../assets/images/getting-started/relations/entity-mapping-1.png) - - Now, you can proceed to add the edge relations. +**To add edge relations** -# Add edge relations +1. On the navigation pane, go to **Ingestion** > **Sources**. -Adding edge relations means linking the current data set to already existing records. +1. Find and select the needed data set (for example, file 2 "training-employee.csv"). -**To add edge relations** +1. Go to the **Map** tab, and then select **Edit mapping**. 1. Go to the **Add edge relations** tab, and then select **Add relation**. @@ -75,13 +61,11 @@ Adding edge relations means linking the current data set to already existing rec 1. On the **Configuration** tab, do the following: - 1. Specify the edge type to define the nature of relation between objects. + 1. 
Specify the edge type to define the nature of the relation between objects. You can select an existing edge type or create a new one. To create a new edge type, enter a slash (/) and then enter a name. - You can select the existing edge type or create a new one. To create a new edge type, enter a slash (/) and then enter a name. + 1. Find and select the target business domain to which you will link the records from the current data set. - 1. Find and select the entity type to which you will link the records belonging to the current entity type. - - 1. Enter the origin of the current data set. It will displayed after you process the data. + 1. Define the origin of the target data set. It will be displayed after you process the data. ![entity-mapping-2.png](../../assets/images/getting-started/relations/entity-mapping-2.png) @@ -97,7 +81,7 @@ Adding edge relations means linking the current data set to already existing rec After you processed the data and streamed the records, you can view the relations between golden records in the following places: -- In CluedIn: on the **Relations** tab of the record details page. +- In CluedIn: on the **Relations** tab of the golden record details page. ![view-relations-1.png](../../assets/images/getting-started/relations/view-relations-1.png) @@ -109,4 +93,14 @@ After you processed the data and streamed the records, you can view the relation ![view-relations-3.png](../../assets/images/getting-started/relations/view-relations-3.png) - If you add more edge relations between the records, CluedIn will automatically identify the changes and update the stream with new edge relations. \ No newline at end of file + If you add more edge relations between the records, CluedIn will automatically identify the changes and update the stream with new edge relations. + +## Results & next steps + +After completing all steps outlined in this guide, you learned how to add edges to build relations between golden records in CluedIn.
You've reached the final part of the Getting Started section. Now might be a great time to dive deeper into the key terms and features of CluedIn: + +- [Golden records](/key-terms-and-features/golden-records) + +- [Identifiers](/key-terms-and-features/entity-codes) + +- [Origin](/key-terms-and-features/origin) \ No newline at end of file diff --git a/docs/030-administration/010-user-management.md b/docs/030-administration/010-user-management.md index 2c06bf9c..339acfe4 100644 --- a/docs/030-administration/010-user-management.md +++ b/docs/030-administration/010-user-management.md @@ -89,4 +89,16 @@ The following diagram shows the flow of deactivating a user in CluedIn. ![deactivate-user-1.png](../../assets/images/administration/user-management/deactivate-user-1.png) - The user won't be able to sign in to CluedIn. \ No newline at end of file + The user won't be able to sign in to CluedIn. + +## User management reference + +Whenever a new user is added to CluedIn or changes are made on the user details page, these actions are recorded and can be found on the **Audit Log** tab. These actions include the following: + +- Create a user +- Add a user to a role +- Remove a user from a role +- Activate a user +- Deactivate a user +- Delete user invite +- Complete registration \ No newline at end of file diff --git a/docs/030-administration/040-claims.md b/docs/030-administration/040-claims.md index ef5f4f13..987cc719 100644 --- a/docs/030-administration/040-claims.md +++ b/docs/030-administration/040-claims.md @@ -91,7 +91,7 @@ This section focuses on data quality metrics, sensitive data identification, glo **Global Data Model** -This claim governs access to global data model that allows you to explore connections between entity types in the platform. It covers only one action—viewing global data model. +This claim governs access to the global data model, which allows you to explore connections between business domains in the platform. It covers only one action—viewing the global data model.
**Metrics** @@ -147,12 +147,12 @@ This section focuses on various possibilities for managing your golden records: **Entity Types** -This claim governs access to entity types (also known as business domains) that describe the semantic meaning of golden records. Learn more about entity types in a dedicated [article](/management/entity-type). +This claim governs access to business domains (previously known as entity types) that describe the semantic meaning of golden records. Learn more about business domains in a dedicated [article](/management/entity-type). This claim covers the following actions: -- Viewing an entity type. -- Creating and editing an entity type. +- Viewing a business domain. +- Creating and editing a business domain. **Access Control** diff --git a/docs/030-administration/070-entity-page-layout.md b/docs/030-administration/070-entity-page-layout.md index 61c1aaad..a3e9e530 100644 --- a/docs/030-administration/070-entity-page-layout.md +++ b/docs/030-administration/070-entity-page-layout.md @@ -14,7 +14,7 @@ An entity page layout is the way in which information about a golden record is a ## Overview of entity page layouts -The entity page layout is assigned to the entity type. This ensures that all golden records belonging to that entity type consistently display relevant information on the golden record overview page. +The entity page layout is assigned to the business domain. This ensures that all golden records belonging to that business domain consistently display relevant information on the golden record overview page. All layouts are stored in **Administration** > **Entity page layouts**. CluedIn contains several built-in entity page layouts: @@ -30,7 +30,7 @@ All layouts are stored in **Administration** > **Entity page layouts**. CluedIn You cannot edit built-in layouts, but you can create custom layouts. -When you create a new entity type while creating mapping, the default layout is assigned to such entity type. 
You change the layout later by [editing](/management/entity-type#manage-an-entity-type) the entity type. +When you create a new business domain while creating a mapping, the default layout is assigned to that business domain. You can change the layout later by [editing](/management/entity-type#manage-an-entity-type) the business domain. The layouts consist of the following elements: @@ -54,9 +54,9 @@ This tab contains general information about the layout, including: - Elements that are displayed on the golden record overview page (all or custom core and non-core vocabulary keys, suggested search, and quality metrics). -**Entity types** +**Business domains** -This tab contains all entity types that use the current layout. You can also add more entity types to which the layout will be assigned. +This tab contains all business domains that use the current layout. You can also add more business domains to which the layout will be assigned. ## Create a layout @@ -76,9 +76,9 @@ If built-in layouts are not sufficient for you, you create your own layout. ![create-layout.gif](../../assets/images/administration/entity-page-layout/create-layout.gif) - Alternatively, you can create a layout from the entity type page. You can do it when editing the entity type. + Alternatively, you can create a layout from the business domain page. You can do it when editing the business domain. - After the layout is created, you can assign it to the entity type. As a result, the information on the **Overview** tab of all golden records belonging to the entity type will be arranged according to the selected layout. + After the layout is created, you can assign it to the business domain. As a result, the information on the **Overview** tab of all golden records belonging to the business domain will be arranged according to the selected layout.
## Manage a layout @@ -114,4 +114,4 @@ You can change the layout configuration and choose the elements that should be d ![edit-layout.gif](../../assets/images/administration/entity-page-layout/edit-layout.gif) - The changes will be automatically applied to the overview pages of all golden records that belong to the entity type associated with the current layout. \ No newline at end of file + The changes will be automatically applied to the overview pages of all golden records that belong to the business domain associated with the current layout. \ No newline at end of file diff --git a/docs/040-integration.md b/docs/040-integration.md index d14f7b4b..8e496643 100644 --- a/docs/040-integration.md +++ b/docs/040-integration.md @@ -1,6 +1,6 @@ --- layout: cluedin -title: Integration +title: Ingestion nav_order: 60 has_children: true permalink: /integration @@ -8,7 +8,7 @@ permalink: /integration {: .fs-6 .fw-300 } -The **Integration** module allows you to upload your data into CluedIn, map it to standard fields, and process is to turn your data into golden records. +In the **Ingestion** module, you can upload your data into CluedIn, map it to standard fields, and process it to turn your data into golden records.
@@ -26,4 +26,32 @@ The **Integration** module allows you to upload your data into CluedIn, map it t
Crawlers
Build robust integrations and crawlers
-
\ No newline at end of file + + +When you open the **Ingestion** module, the first thing you see is the dashboard that can simplify and streamline your work with data sources and source records. + +
+ +
+ +The dashboard is a place where you can start the process of uploading the data into CluedIn as well as find general statistics about your data sources. It consists of three main sections. + +**Source actions** + +At the top of the dashboard, you can find the actions to upload the data into CluedIn from a [file](/integration/file), an [ingestion endpoint](/integration/endpoint), and a [database](/integration/database). Additionally, you can add a [manual data entry](/integration/manual-data-entry) project and navigate to the list of installed crawlers. Each action card contains a number that indicates the count of data sources of a particular type that are currently in CluedIn. Selecting the number in the file, ingestion endpoint, or database action card will take you to the **Sources** page with data sources filtered by a specific type. To view the list of manual data entry projects, select the number in the corresponding action card. + +**Data set records pending review** + +This table allows you to track the number of records per data set that are in [quarantine](/integration/additional-operations-on-records/quarantine) or [require approval](/integration/additional-operations-on-records/source-records-approval). The table includes up to the 20 latest sources, regardless of the owner. However, even if the table displays fewer than 20 sources, there might be additional sources requiring review. This is because after approving a specific source, only sources that were added later than the approved source will appear in the table. + +To view the records that are currently in quarantine, select the corresponding number in the **Quarantine** column. Similarly, to view the records that require approval, select the corresponding number in the **Requires approval** column. If you are not the owner of the data source, you cannot approve or reject the records on the **Quarantine** or **Approval** tabs of the data set.
+ +If you want to track records in data sources where you are the owner, go to **Home** > **My tasks**. You will see a similar table with the number of records per data set that are in quarantine or require approval. Each item in the table comes from a data source where you are the owner, so you can go ahead and review the records. + +**Manual data entry project records pending review** + +This table allows you to track the number of records per manual data entry project that [require approval](/integration/additional-operations-on-records/source-records-approval). The table includes up to the 20 latest projects, regardless of the owner. However, even if the table displays fewer than 20 projects, there might be additional projects requiring review. This is because after approving records from a specific manual data entry project, only projects that were added later than the approved project will appear in the table. + +To view the records that require approval, select the corresponding number in the **Requires approval** column. If you are not the owner of the manual data entry project, you cannot approve or reject the records on the **Approval** tab of the project. + +If you want to track records in manual data entry projects where you are the owner, go to **Home** > **My tasks**. You will see a similar table with the number of records per manual data entry project that require approval. Since you are the owner of each project in the table, you can go ahead and review the records.
\ No newline at end of file diff --git a/docs/040-integration/180-additional-operations-on-records.md b/docs/040-integration/180-additional-operations-on-records.md index 9307b058..a9d7219e 100644 --- a/docs/040-integration/180-additional-operations-on-records.md +++ b/docs/040-integration/180-additional-operations-on-records.md @@ -27,6 +27,10 @@ Although normalizing, transforming, and improving the quality of records before CluedIn provides the following tools that you can use to enhance the quality of your data before processing: +- [Preview](/integration/additional-operations-on-records/preview) – analyze the uploaded records and improve their quality before processing. + +- [Validations](/integration/additional-operations-on-records/validations) – check the records for errors, inconsistencies, and missing values and fix these issues to improve the quality of the records. + - [Property rules](/integration/additional-operations-on-records/property-rules) – normalize and transform property values of mapped records. - [Pre-process rules](/integration/additional-operations-on-records/preprocess-rules) – improve the overall quality of mapped records. @@ -35,4 +39,6 @@ CluedIn provides the following tools that you can use to enhance the quality of - [Quarantine](/integration/additional-operations-on-records/quarantine) – handle records that do not meet certain conditions set in property rules, pre-process rules, or advanced mapping. -Additionally, you will learn how to interpret [logs](/integration/additional-operations-on-records/logs) and [monitoring](/integration/additional-operations-on-records/monitoring) statistics to get an insight into what is going on with your records. \ No newline at end of file +- [Approval](/integration/additional-operations-on-records/approval) – approve or reject specific records to ensure that only verified records are sent for processing. 
+ +You will learn how to interpret [logs](/integration/additional-operations-on-records/logs) and [monitoring](/integration/additional-operations-on-records/monitoring) statistics to get an insight into what is going on with your records. Additionally, you will learn about the [removal of records](/integration/additional-operations-on-records/remove-records) that were created from a specific data source. \ No newline at end of file diff --git a/docs/040-integration/200-crawlers.md b/docs/040-integration/200-crawlers.md index 537a53d1..0a84ee15 100644 --- a/docs/040-integration/200-crawlers.md +++ b/docs/040-integration/200-crawlers.md @@ -1,6 +1,6 @@ --- layout: cluedin -nav_order: 030 +nav_order: 040 parent: Integration permalink: /integration/crawlers-and-enrichers title: Crawlers diff --git a/docs/040-integration/210-manual-data-entry.md b/docs/040-integration/210-manual-data-entry.md new file mode 100644 index 00000000..699267de --- /dev/null +++ b/docs/040-integration/210-manual-data-entry.md @@ -0,0 +1,25 @@ +--- +layout: cluedin +nav_order: 030 +parent: Integration +permalink: /integration/manual-data-entry +title: Manual data entry +has_children: true +last_modified: 2025-04-01 +--- + +In addition to loading data from various sources, you can also enter data manually. The **Manual data entry** module allows you to create records directly in CluedIn. This way, you can capture data that may not be available from other sources but is necessary for your data management purposes. + +
+ +
+ +Manual data entry is used to create records in an enforced and structured manner. This means that the Administrator sets up a manual data entry project and defines the schema that will be used to create the records. Data Stewards are then tasked with creating records according to this schema. + +This section covers the following areas: + +- [Configuring a manual data entry project](/integration/manual-data-entry/configure-a-manual-data-entry-project) – learn how to create and configure a manual data entry project, add form fields, review mapping details, and define the quality rating for the source. + +- [Adding records in a manual data entry project](/integration/manual-data-entry/add-records-in-a-manual-data-entry-project) – learn how to add single or multiple records manually, directly in CluedIn. + +- [Managing a manual data entry project](/integration/manual-data-entry/manage-a-manual-data-entry-project) – learn how to make changes to the manual data entry project and how to manage access to the manual data entry project and the records created within the project. \ No newline at end of file diff --git a/docs/040-integration/additional-operations-on-records/060-advanced-mapping.code.md b/docs/040-integration/additional-operations-on-records/030-advanced-mapping.code.md similarity index 99% rename from docs/040-integration/additional-operations-on-records/060-advanced-mapping.code.md rename to docs/040-integration/additional-operations-on-records/030-advanced-mapping.code.md index d5515116..0780d074 100644 --- a/docs/040-integration/additional-operations-on-records/060-advanced-mapping.code.md +++ b/docs/040-integration/additional-operations-on-records/030-advanced-mapping.code.md @@ -180,7 +180,7 @@ if (getVocabularyKeyValue('customer.industry') === 'Oil & Gas') { **addCode**
+Adds an identifier (previously known as code) to the clue. The identifier usually consists of a business domain (previously known as entity type), an origin, and a specific value. You can specify the required origin and value to generate the identifier. In the following example, the first parameter is the origin and the second parameter is the value. The resulting identifier would be `"/Customer#myOrigin:myCode"`. diff --git a/docs/040-integration/additional-operations-on-records/030-quarantine.md b/docs/040-integration/additional-operations-on-records/040-quarantine.md similarity index 100% rename from docs/040-integration/additional-operations-on-records/030-quarantine.md rename to docs/040-integration/additional-operations-on-records/040-quarantine.md diff --git a/docs/040-integration/additional-operations-on-records/050-approval.md b/docs/040-integration/additional-operations-on-records/050-approval.md new file mode 100644 index 00000000..6d9bbdcb --- /dev/null +++ b/docs/040-integration/additional-operations-on-records/050-approval.md @@ -0,0 +1,176 @@ +--- +layout: cluedin +nav_order: 5 +parent: Additional operations +grand_parent: Integration +permalink: /integration/additional-operations-on-records/approval +title: Approval +last_modified: 2025-04-03 +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will learn how to implement source record approval, which allows you to approve or reject specific records and ensure that only verified records are sent for processing. + +<
+ +
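+The gatekeeping flow detailed in this article can be sketched as a simple decision pipeline. This is a conceptual illustration only; the function and rule names below are hypothetical and not part of CluedIn's API, since the platform's internal pipeline is not exposed as code:

```javascript
// Conceptual sketch of the source record approval flow described in this
// article. All names here are hypothetical, used only to mirror the
// documented stages: quarantine check, then approval check, then processing.
function routeRecord(record, rules) {
  if (rules.shouldQuarantine(record)) {
    return 'quarantine';   // held until a user resolves it out of quarantine
  }
  if (rules.requiresApproval(record)) {
    return 'approval';     // held until an owner approves or rejects it
  }
  return 'processing';     // becomes (or aggregates into) a golden record
}

// Example rule set: quarantine records missing an email, and require
// approval for records from a hypothetical "delta" ingestion batch.
const rules = {
  shouldQuarantine: (r) => !r.email,
  requiresApproval: (r) => r.batch === 'delta',
};

console.log(routeRecord({ email: 'a@b.com', batch: 'delta' }, rules)); // "approval"
```

+Note that, as described below, an approved record never returns to an earlier stage; it goes straight to processing.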
+ +You can configure source record approval for data sources of any type (file, ingestion endpoint, database) and for manual data entry projects. + +## Source record approval flow + +When source record approval is configured for a data source, each record goes through several stages. When a non-owner user processes the data set or when a new ingestion appears for the data set that is in the auto-submission mode, CluedIn first checks if the record should be sent to [quarantine](/integration/additional-operations-on-records/quarantine). If yes, the record is sent to quarantine and does not move forward in the processing pipeline. When a quarantined record is approved, it moves to the next stage where CluedIn checks if the record requires approval. If yes, the record is sent to the approval area and does not move forward in the processing pipeline. When the record is approved, it moves to processing, resulting in the creation of a new golden record or aggregation to an existing golden record. + +![source-record-approval-flow.png](../../assets/images/integration/additional-operations/source-record-approval-flow.png) + +{:.important} +After source record approval, the record does not go back to any previous stages in the processing pipeline; it goes straight into processing. + +**Difference between quarantine and approval** + +When the record is in quarantine, you can edit it and fix some data quality issues. When the record is in the approval area, you cannot edit it; you can only approve or reject it. + +## Configure source record approval + +In this section, you will learn how to configure source record approval in data sets and manual data entry projects. Only owners can configure source record approval. + +### Source record approval in data sets + +Source record approval is particularly useful for data sets created via the ingestion endpoint. 
If you have already completed the initial full data load and now want to ingest only delta records on a daily basis, the approval mechanism can help you ensure that only verified delta records are processed. The owners of the data source can then review such delta records and decide whether they should be processed. + +You can configure source record approval using one of the following methods: + +- [Property rules](/integration/additional-operations-on-records/property-rules) – use to define the conditions for sending the source record for approval based on the property values of the record. + +- [Pre-process rules](/integration/additional-operations-on-records/preprocess-rules) – use to define the conditions for sending the source record for approval based on the whole record, and not its specific property values. + +- [Advanced mapping code](/integration/additional-operations-on-records/advanced-mapping-code) – use to define complex logic for sending the source record for approval. + +**To configure source record approval in a data set** + +1. On the navigation pane, go to **Ingestion** > **Sources**. + +1. Find and open the data source and its data set for which you want to configure source record approval. + +1. Define the rules for sending the source records for approval in one of the following ways: + + - By adding a property rule: + + 1. Go to the **Map** tab, and then select **Edit mapping**. Make sure you are on the **Map columns to vocabulary key** tab. + + 1. Find the vocabulary key by which you want to determine whether the record should be sent for approval. Then, next to this vocabulary key, select **Add property rule**. + + 1. In the **Filter** section, select whether you want to apply the rule to all values of the records (**Apply action to all values**) or to specific values (**Add filter to apply action**). For example, you might want to send the record for approval if the vocabulary key contains a specific value. + + 1. 
In the **Action** dropdown list, select **Require approval**. + + ![source-record-approval-property-rule.png](../../assets/images/integration/additional-operations/source-record-approval-property-rule.png) + + 1. Select **Add Rule**. + + - By adding a pre-process rule: + + 1. Go to the **Pre-process Rules** tab, and then select **Add Pre-process Rule**. + + 1. Enter the name of the rule. + + 1. In the **Filter** section, select whether you want to apply the rule to all records (**Apply action to all values**) or to specific records (**Add filter to apply action**). + + 1. In the **Action** dropdown list, select **Require approval**. + + ![source-record-approval-pre-process-rule.png](../../assets/images/integration/additional-operations/source-record-approval-pre-process-rule.png) + + 1. Select **Add Rule**. + + - By writing advanced mapping code: + + 1. Go to the **Map** tab, and then select **Advanced**. + + 1. On the left side of the page, write the code to define the conditions for sending the source records for approval. You can write any JavaScript code. + + ![source-record-approval-advanced-mapping.png](../../assets/images/integration/additional-operations/source-record-approval-advanced-mapping.png) + + 1. Select **Save**. + +1. Go to the **Process** tab, and make sure that **Auto-submission** is enabled. This way only delta records will be sent for approval, rather than all records from the **Preview** tab. + + Now, when delta records are ingested to CluedIn and they meet the conditions in pre-process rules, property rules, or advanced mapping, a new entry appears on the **Process** tab, informing that these records have been sent for approval. + + ![source-record-approval-process-tab.png](../../assets/images/integration/additional-operations/source-record-approval-process-tab.png) + + Additionally, the owners of the data source will receive a notification in CluedIn about the records pending approval. 
The owners of the data source can now [review](#review-records-pending-approval) these records and approve or reject them. + +### Source record approval in manual data entry project + +Source record approval is useful for [manual data entry projects](/integration/manual-data-entry) as it grants project owners full control over the records created in the project. The owners of the manual data entry project can review new records added by non-owner users and decide whether they should be processed and turned into golden records. + +**To configure source record approval for manual data entry** + +1. On the navigation pane, go to **Ingestion** > **Manual Data Entry**. + +1. Find and open the manual data entry project for which you want to configure source record approval. + +1. In the upper-right corner, select **Edit**. + +1. In the **Require records approval** section, select the checkbox to enable the approval mechanism. + + ![source-record-approval-manual-configuration.png](../../assets/images/integration/additional-operations/source-record-approval-manual-configuration.png) + +1. Select **Save**. + + Now, when a non-owner user adds a new record in the manual data entry project, you will receive a notification about a record pending your approval. You can then [review](#review-records-pending-approval) such record and approve or reject it. + +## Review records pending approval + +To review source records pending approval, you require the following: + +- You should be added to the **Permissions** tab of the data source. This is necessary to view source records. For more information about access to data, see [Source control](/administration/user-access/data-access#source-control). + +- You should be added to the **Owners** tab of the data set. This is necessary to approve or reject source records. For more information about the right to approve and reject, see [Ownership](/administration/user-access/feature-access#ownership). 
+ +If you have the required permissions to the data source and you are among the owners of the data source, you will receive a notification every time the records pending your approval appear in CluedIn. Additionally, you will receive a daily email summarizing the records that require your approval. + +To track the records that require your approval, on the navigation pane, go to **Home** > **My Tasks**. Here, you will find two tables: + +- **Data set records pending review** – this table allows you to track the number of records per data set that are in quarantine or require approval. The table contains information about the data sources for which you are the owner. + + ![source-record-approval-data-set.png](../../assets/images/integration/additional-operations/source-record-approval-data-set.png) + +- **Manual data entry project records pending review** – this table allows you to track the number of records per manual data entry project that require approval. The table contains information about the manual data entry projects for which you are the owner. + + ![source-record-approval-manual-data-entry.png](../../assets/images/integration/additional-operations/source-record-approval-manual-data-entry.png) + +**To review records pending approval** + +1. Find the records pending approval by doing one of the following: + + - In the notification about records being sent for approval, select **View**. + + ![source-record-approval-notification.png](../../assets/images/integration/additional-operations/source-record-approval-notification.png) + + - In the daily email about records pending approval, select the number of records that require your approval. + + - In the **Data set records pending review** or **Manual data entry project records pending review** table, select the number of records that require your approval. + + As a result, the **Approval** tab of the data set opens. + +1. On the **Approval** tab, review the records. 
To find the reason why a record has been sent for approval, select the icon in the **Details** column. Then, do one of the following: + + - If the source record should be processed, select the check mark in the **Actions** column for this record. + + ![source-record-approval-approval-tab.png](../../assets/images/integration/additional-operations/source-record-approval-approval-tab.png) + + You can approve records one by one, or you can approve them in bulk. To approve records in bulk, select the checkboxes in the first column for the needed records. Then, open the three-dot menu, and select **Approve clues**. + + To approve all records on the page, select the checkbox in the first column header. Then, open the three-dot menu, and select **Approve clues**. Keep in mind that this action approves only the records on the current page, not the records on other pages. + + Every time you approve a record or a batch of records, a corresponding entry is added to the **Process** tab, meaning that these records have now been processed. + + - If the source record is invalid or wrong and should not be processed, select the cross mark in the **Actions** column for this record. + + You can reject records one by one, or you can reject them in bulk. To reject records in bulk, select the checkboxes in the first column for the needed records. Then, open the three-dot menu, and select **Reject clues**. + + To reject all records on the page, select the checkbox in the first column header. Then, open the three-dot menu, and select **Reject clues**. Keep in mind that this action rejects only the records on the current page, not the records on other pages. To reject all records across all pages, open the three-dot menu and select **Reject all clues**. 
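The page-scoped behavior of the bulk actions described above can be illustrated with a short sketch; the record structure, statuses, and page size are illustrative assumptions, not the CluedIn API:

```python
def page_slice(records, page, page_size=20):
    """Return only the records shown on the given 1-based page."""
    start = (page - 1) * page_size
    return records[start:start + page_size]

def approve_page(records, page, page_size=20):
    """Mimic 'select all on page' + 'Approve clues': only the records
    on the current page are approved, not records on other pages."""
    approved = page_slice(records, page, page_size)
    for record in approved:
        record["status"] = "approved"
    return approved

records = [{"id": i, "status": "pending"} for i in range(50)]
approved = approve_page(records, page=1)
# With a page size of 20, the 20 records on page 1 are approved,
# while the 30 records on pages 2 and 3 stay pending.
```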
\ No newline at end of file diff --git a/docs/040-integration/additional-operations-on-records/050-monitoring.md b/docs/040-integration/additional-operations-on-records/050-monitoring.md deleted file mode 100644 index fa143506..00000000 --- a/docs/040-integration/additional-operations-on-records/050-monitoring.md +++ /dev/null @@ -1,76 +0,0 @@ ---- -layout: cluedin -nav_order: 6 -parent: Additional operations -grand_parent: Integration -permalink: /integration/additional-operations-on-records/monitoring -title: Monitoring -tags: ["integration", "monitoring"] -last_modified: 2023-11-07 ---- -## On this page -{: .no_toc .text-delta } -- TOC -{:toc} - -In this article, you will learn about the features on the **Monitoring** tab to gain insight into what is happening with your records and help you quickly identify issues. - -## Monitoring for all types of data sources - -Regardless of the type of data source, the **Monitoring** tab in the data set includes the following sections: - -- **Total** – here you can view general information about the records from the current data set, including the total number of records, original columns, mapped columns, and records in quarantine. This is a useful tool to compare the number of original columns and mapped columns. - -- **Global queues** – here you can view global statistics on ingestion and processing requests from all data sets. This is a useful tool to ensure that the system runs correctly. - -- **Data set queues** – here you can view statistics on the records in the current data set during different stages of their life cycle (loading, mapping, processing). - -In the data set created using an endpoint, these sections are located in the **Overview** area. - -If the number of messages of any type is greater than 0 while the number of consumers is 0, there may be an issue with your data. The following screenshot illustrates a situation where troubleshooting is needed to fix the processing of records. 
- -![monitoring-1.png](../../assets/images/integration/additional-operations/monitoring-1.png) - -To get better visibility into what is happening with your records, in addition to monitoring, use **system healthchecks**. You can find them in the upper-right corner of CluedIn. - -![monitoring-3.png](../../assets/images/integration/additional-operations/monitoring-3.png) - -If the status of any item is red or orange, it means that something is wrong and some services probably need to be restarted. To fix the problem, contact the person responsible for maintaining CluedIn for your organization (for example, system administrator) who can restart the needed service. - -The following table provides descriptions of each message queue and corresponding troubleshooting guidelines. If the number of messages doesn't return to 0 or if the number of consumers remains 0, refer to the **Troubleshooting** column for recommended actions. - -| Queue | Description | Troubleshooting | -|--|--|--| -| Ingestion data set | Messages representing JSON objects sent to various endpoints. | If you are a system administrator, restart the pod named "datasource-processing". | -| Commit data set | Messages representing requests for data set processing. Messages can be added by selecting the **Process** button on the **Process** tab of the data set or each time the endpoint receives data with the auto-submission enabled. | If you are a system administrator, restart the pod named "datasource-processing". | -| Submitting Messages | Messages containing JSON objects sent to the mapping service to be converted into records during processing. | Go to the **Process** tab and select **Cancel**. If you are a system administrator, verify the status of the mapping service and restart the pod containing the name "annotation". | -| Processing Messages | Messages containing records sent to the processing pipeline. | If you are a system administrator, restart the pod containing the name "submitter". 
| -| Quarantine Messages | Messages containing records that were approved on the **Quarantine** tab and sent to the processing pipeline. | If you are a system administrator, restart the pod containing the name "submitter". | -| Loading Failures | Messages containing records from the data set that cannot be fully loaded. | Go to the **Preview** tab and select **Retry**. | -| Error Processing Messages | Messages containing records that could not be processed by the processing pipeline because it does not respond. | Go to the **Process** tab and select **Retry**. If you are a system administrator, verify the status of processing pods. | - -## Monitoring for endpoints - -In the data set created using an endpoint, the **Monitoring** tab includes two areas: **Overview** and **Ingestion reports**. The **Overview** area contains statistics described in the previous section. This section focuses on the **Ingestion reposts** area. - -![ingestion-reports.png](../../assets/images/integration/additional-operations/ingestion-reports.png) - -The **Ingestion reposts** area contains a table with detailed reports generated for each request sent to an endpoint. The table contains the following columns: - -- **ReceiptID** – unique identifier of a request. Every request you send to an endpoint, whether successful or not, receives a unique receipt ID. This ID allows you to quickly locate the request report in CluedIn. Simply copy the receipt ID from the request response and paste it to the search field above the table. - -- **Received** – the number of records received by CluedIn. If the number is 0, it means that the request contained errors and CluedIn rejected it. - -- **Loaded** – the number of records loaded into CluedIn. This column contains three categories: - - - **Success** – the number of records that were successfully loaded into CluedIn. - - - **Failed** – the number of records that failed to load into CluedIn. 
- - - **Retry** – the number of records that attempted to reload into CluedIn. - -- **Logs** – the number of logs generated for a specific request. You can view the log details by selecting the content of the cell. Keep in mind that for endpoints, we only log warnings. These logs are the same as those found on the **Logs** tab of the dataset. The difference is that the **Logs** tab contains logs for all requests, while the **Ingestion reports** table provides logs for each specific request. For more information on how to read logs, see the [Logs](/integration/additional-operations-on-records/logs) documentation. - -- **Processed** – the number of times the records from a particular request have been processed in CluedIn. - -- **Created at** – the timestamp indicating when the ingestion report was generated. This corresponds to the time when the HTTP request was executed. diff --git a/docs/040-integration/additional-operations-on-records/040-logs.md b/docs/040-integration/additional-operations-on-records/060-logs.md similarity index 99% rename from docs/040-integration/additional-operations-on-records/040-logs.md rename to docs/040-integration/additional-operations-on-records/060-logs.md index 645791c6..914064cf 100644 --- a/docs/040-integration/additional-operations-on-records/040-logs.md +++ b/docs/040-integration/additional-operations-on-records/060-logs.md @@ -1,6 +1,6 @@ --- layout: cluedin -nav_order: 5 +nav_order: 6 parent: Additional operations grand_parent: Integration permalink: /integration/additional-operations-on-records/logs diff --git a/docs/040-integration/additional-operations-on-records/070-monitoring.md b/docs/040-integration/additional-operations-on-records/070-monitoring.md new file mode 100644 index 00000000..97f881e6 --- /dev/null +++ b/docs/040-integration/additional-operations-on-records/070-monitoring.md @@ -0,0 +1,130 @@ +--- +layout: cluedin +nav_order: 7 +parent: Additional operations +grand_parent: Integration +permalink: 
/integration/additional-operations-on-records/monitoring +title: Monitoring +tags: ["integration", "monitoring"] +last_modified: 2025-02-04 +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will learn about the features on the **Monitoring** tab that give you insight into what is happening with your records and help you quickly identify issues. + +## Common monitoring elements + +Regardless of the data source type (file, ingestion endpoint, or database), the **Monitoring** tab in the data set includes the following sections: + +- **Total** – here you can view general information about the records from the current data set, including the total number of records, original columns, mapped columns, and records in quarantine. This is a useful tool to compare the number of original columns and mapped columns. + + ![monitoring-total.png](../../assets/images/integration/additional-operations/monitoring-total.png) + +- **Global queues** – here you can view global statistics on ingestion and processing requests from all data sets. This is a useful tool to ensure that the system runs correctly. + + ![monitoring-global-queues.png](../../assets/images/integration/additional-operations/monitoring-global-queues.png) + +- **Data set queues** – here you can view statistics on the records in the current data set during different stages of their life cycle (loading, mapping, processing). + + ![monitoring-data-set-queues.png](../../assets/images/integration/additional-operations/monitoring-data-set-queues.png) + +If the number of messages of any type is greater than 0 while the number of consumers is 0, there may be an issue with your data. The following screenshot illustrates a situation where troubleshooting is needed to fix the processing of records. 
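The rule above (messages queued while no consumers are attached) can be expressed as a small check; the `QueueStats` structure and sample numbers are illustrative assumptions, not a CluedIn API:

```python
from dataclasses import dataclass

@dataclass
class QueueStats:
    # Illustrative shape of the per-queue numbers shown on the Monitoring tab.
    name: str
    messages: int
    consumers: int

def needs_attention(queue: QueueStats) -> bool:
    """Messages are waiting but nothing is consuming them."""
    return queue.messages > 0 and queue.consumers == 0

queues = [
    QueueStats("Processing Messages", messages=1500, consumers=0),
    QueueStats("Ingestion data set", messages=0, consumers=2),
]
stuck = [q.name for q in queues if needs_attention(q)]
# A non-empty `stuck` list is the signal to start troubleshooting.
```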
+ +![monitoring-error.png](../../assets/images/integration/additional-operations/monitoring-error.png) + +To get better visibility into what is happening with your records, in addition to monitoring, use **system healthchecks**. You can find them in the upper-right corner of CluedIn. + +![system-healthchecks.png](../../assets/images/integration/additional-operations/system-healthchecks.png) + +If the status of any item is red or orange, it means that something is wrong, and some services probably need to be restarted. To fix the problem, contact the person responsible for maintaining CluedIn for your organization (for example, system administrator) who can restart the needed service. + +The following table provides the description of each queue and corresponding troubleshooting guidelines. If the number of messages doesn't return to 0 or if the number of consumers remains 0, refer to the **Troubleshooting** column for recommended actions. + +| Queue | Description | Troubleshooting | +|--|--|--| +| Ingestion data set | Messages representing JSON objects sent to various endpoints. | If you are a system administrator, restart the pod named "datasource-processing". | +| Commit data set | Messages representing requests for data set processing. Messages can be added by selecting the **Process** button on the **Process** tab of the data set or each time the endpoint receives data with the auto-submission enabled. | If you are a system administrator, restart the pod named "datasource-processing". | +| Submitting Messages | Messages containing JSON objects sent to the mapping service to be converted into records during processing. | Go to the **Process** tab and select **Cancel**. If you are a system administrator, verify the status of the mapping service and restart the pod named "annotation". | +| Processing Messages | Messages containing records sent to the processing pipeline. | If you are a system administrator, restart the pod named "submitter". 
| +| Quarantine Messages | Messages containing records that were approved on the **Quarantine** tab and sent to the processing pipeline. | If you are a system administrator, restart the pod named "submitter". | +| Loading Failures | Messages containing records from the data set that cannot be fully loaded. | Go to the **Preview** tab and select **Retry**. | +| Error Processing Messages | Messages containing records that could not be processed by the processing pipeline because it does not respond. | Go to the **Process** tab and select **Retry**. If you are a system administrator, verify the status of processing pods. | + +## Monitoring for ingestion endpoints + +In a data set created from an ingestion endpoint, the **Monitoring** tab consists of three areas: + +- [Overview](#overview) + +- [Ingestion reports](#ingestion-reports) + +- [Ingestion anomalies](#ingestion-anomalies) + +### Overview + +The **Overview** area contains general statistics about the records and queues described in [Common monitoring elements](#common-monitoring-elements). + +### Ingestion reports + +The **Ingestion reports** area contains information about each request sent to the specific ingestion endpoint in CluedIn. This information is presented in a table that consists of the following columns: + +- **ReceiptID** – unique identifier of a request. Each request you send to an ingestion endpoint in CluedIn, whether successful or not, has a unique receipt ID. + + ![monitoring-postman-receipt-id.png](../../assets/images/integration/additional-operations/monitoring-postman-receipt-id.png) + + This ID allows you to quickly locate the request report in CluedIn. Simply copy the receipt ID from the request response and paste it into the search box above the table. + + ![monitoring-search-receipt-id.png](../../assets/images/integration/additional-operations/monitoring-search-receipt-id.png) + + To view the records that were sent to CluedIn in that specific request, select the receipt ID. 
As a result, the records appear in the **Ingested records** pane. + + ![monitoring-receipt-id-ingested-records.png](../../assets/images/integration/additional-operations/monitoring-receipt-id-ingested-records.png) + + Alternatively, you can copy the receipt ID, go to the **Preview** tab, paste the receipt ID into the search box, and start to search. As a result, only the records sent in a specific request will be displayed on the page. + +- **Received** – the number of records received by CluedIn. If the number is 0, it means that the request contained errors and CluedIn rejected it. + +- **Loaded** – the number of records loaded into CluedIn. This column contains three categories: + + - **Success** – the number of records that were successfully loaded into CluedIn. + + - **Failed** – the number of records that failed to load into CluedIn. + + - **Retry** – the number of records for which CluedIn attempted a reload. + +- **Logs** – the number of logs generated for a specific request. You can view the log details by selecting the content of the cell. Keep in mind that for ingestion endpoints, we only log warnings. These logs are the same as those found on the **Logs** tab of the data set. The difference is that the **Logs** tab contains logs for all requests, while the **Ingestion reports** table provides logs for each specific request. For more information on how to read logs, see the [Logs](/integration/additional-operations-on-records/logs) documentation. + +- **Processed** – the number of times the records from a specific request have been processed in CluedIn. + +- **Created at** – the timestamp indicating when the ingestion report was generated. This corresponds to the time when the HTTP request was executed. + +- **Produced golden records** – the golden records generated from a specific request or those to which records from the request were aggregated as data parts. 
To view a list of golden records produced from a specific request, select **View ingested records**. As a result, the golden records appear in the **Produced golden records** pane. + + ![monitoring-receipt-id-produced-golden-records.png](../../assets/images/integration/additional-operations/monitoring-receipt-id-produced-golden-records.png) + +- **Actions** – currently, this column lets you remove the records (golden records or data parts that were aggregated to existing golden records) produced from a specific request. If you want to remove the records produced from a specific request, select **Remove records**. In the confirmation dialog, you can select the checkbox to remove the source records from the temporary storage on the **Preview** tab. To confirm your choice, enter _DELETE_, and then start the removal process. + + ![monitoring-remove-records.png](../../assets/images/integration/additional-operations/monitoring-remove-records.png) + + Once the records are removed, they will no longer be displayed when you select **View ingested records** for a specific request. The removal mechanism is similar to the one described in [Remove records](/integration/additional-operations-on-records/remove-records); the difference is that that article describes removing all records produced from the data set, while here you can remove only the records produced from a specific request. + +### Ingestion anomalies + +The **Ingestion anomalies** area contains a list of potential errors that can occur with the data set, along with remediation steps and the status for each error. This area provides a quick and easy way to monitor the data set for any problems, ensuring you have complete visibility. If you notice that the status of any error indicates a problem, refer to the remediation steps. + +The following table contains the list of errors and remediation steps. 
You can find a similar table on the **Monitoring** tab in CluedIn. + +| Error name | Remediation | +|--|--| +| **Error in logs**
This can occur for various reasons, which are generally data-related: for example, an invalid field name in the record or an unsupported value for a given data type. | Go to the **Logs** tab and filter the logs by the **Error** level. Then, select a log to view the detailed reason for the error. | +| **Loading failure**
This can occur when the search databases are under heavy load. | Go to the **Preview** tab and select the **Retry** button to attempt reloading your records. If this does not resolve the issue, please reach out to our support team. | +| **Error in submissions**
This can occur when the CluedIn processing pipeline is under intense load and cannot handle all the incoming requests. | Go to the **Process** tab and select the **Retry** button for the submission in the **Error** state to attempt processing your records. If this does not resolve the issue, please reach out to our support team. | +| **Ingestion consumer lost**
The ingestion consumer ingests any payload sent to CluedIn. If this consumer is lost, CluedIn will not ingest any new records. | CluedIn has a self-healing mechanism that checks if a consumer exists for a queue every 5 minutes. If a consumer is lost, the administrator will receive a notification. | +| **Process consumer lost**
The process consumer processes any payload sent to CluedIn. If this consumer is lost, CluedIn will not process any new records. | CluedIn has a self-healing mechanism that checks if a consumer exists for a queue every 5 minutes. If a consumer is lost, the administrator will receive a notification. | +| **Commit consumer lost**
The commit consumer maps any payload sent to CluedIn. If this consumer is lost, CluedIn will not map any new records. | CluedIn has a self-healing mechanism that checks if a consumer exists for a queue every 5 minutes. If a consumer is lost, the administrator will receive a notification. | +| **Submission consumer lost**
The submission consumer sends records to the processing pipeline. If this consumer is lost, CluedIn will not send records to processing, and they will not become golden records. | CluedIn has a self-healing mechanism that checks if a consumer exists for a queue every 5 minutes. If a consumer is lost, the administrator will receive a notification. | +| **Quarantine consumer lost**
The quarantine consumer sends records to quarantine. If this consumer is lost, CluedIn will not send any records to quarantine. | CluedIn has a self-healing mechanism that checks if a consumer exists for a queue every 5 minutes. If a consumer is lost, the administrator will receive a notification. | +| **Submissions stuck**
This can occur either because some messages were manually deleted or because the cluster is down. | If you encounter this error, please reach out to our support team. | diff --git a/docs/040-integration/additional-operations-on-records/080-preview.md b/docs/040-integration/additional-operations-on-records/080-preview.md index 315d1214..72aa7f0a 100644 --- a/docs/040-integration/additional-operations-on-records/080-preview.md +++ b/docs/040-integration/additional-operations-on-records/080-preview.md @@ -5,65 +5,149 @@ parent: Additional operations grand_parent: Integration permalink: /integration/additional-operations-on-records/preview title: Preview -last_modified: 2024-10-25 +last_modified: 2025-04-03 --- ## On this page {: .no_toc .text-delta } - TOC {:toc} -In this article, you will learn about various actions available on the **Preview** tab of the data set to help you analyze and get a better understanding of the uploaded records. +In this article, you will learn about various actions available on the **Preview** tab of the data set to help you analyze the uploaded records and improve their quality before processing. -## Profiling +
+ +
-Profiling provides a breakdown and distribution of all vocabulary key values in a graphic format. It can help you identify issues and anomalies in data quality. Profiling is not limited to a data set, meaning that it displays information about vocabulary key values coming from different data sources. +When you upload the data into CluedIn, it appears on the **Preview** tab that displays the original, raw records. At this stage, you can perform various actions to identify data quality issues and prepare your data set for processing. -Profiling is a beta feature. To access profiling, go to **Administration** > **Feature Flags**, and enable the **Profiling dashboards** feature. +## Search through source records -![preview-profiling-1.png](../../assets/images/integration/additional-operations/preview-profiling-1.png) +You can search through all source records to quickly locate and review particular records. Just enter a key word in the search box and start to search. As a result, all relevant records will be displayed on the page. -You can view profiling at any stage of the data ingestion process: immediately after uploading the data, after creating the mapping, or after processing the data. Profiling is type-specific, so if you change the data type during mapping and then process the data, the profiling dashboard may look different from when you first uploaded the data. For example, if you change the data type for the 'Revenue' column from text to integer, the profiling dashboard will display metrics such as minimum value, maximum value, and average value instead of a breakdown and distribution of text values. +![search-key-word.png](../../assets/images/integration/additional-operations/search-key-word.png) + +If you are working with a data set created from an ingestion endpoint, you can search for records from a specific request send to the endpoint. 
You can find and copy the receipt ID on the **Monitoring** tab, and then paste it into the search box on the **Preview** tab and start to search. As a result, all records that were sent to CluedIn in a specific request will be displayed on the page. For more information about the receipt ID, see [Monitoring](/integration/additional-operations-on-records/monitoring). + +![search-request-id.png](../../assets/images/integration/additional-operations/search-request-id.png) + +## Analyze source records + +Profiling allows you to analyze the uploaded records, identify issues and anomalies, and ensure data quality. By detecting inconsistencies and errors early, you can take corrective actions to improve the accuracy and reliability of your data. + +On the **Preview** tab, there are two types of profiling: + +- [Profiling for data set](#profiling-for-data-set) + +- [Profiling for golden records](#profiling-for-golden-records) + +### Profiling for data set + +Profiling for the data set allows you to analyze key metrics for each column of the uploaded records and identify data gaps, errors, and overall quality issues. By analyzing data set profiling, you can better understand the source records, evaluate their quality, and make informed decisions for mapping. + +**To view data set profiling** + +- Find the column for which you want to view data set profiling. In the column header, open the three-dot menu, and then select **View profiling for data set**. + + ![view-profiling-for-data-set.png](../../assets/images/integration/additional-operations/view-profiling-for-data-set.png) + + The **Key metrics** pane opens, where you can view the details about column values. + +The key data set profiling metrics include the following: + +- **Number of values** – the total number of source records that have a value in a given column. + +- **Number of empty values** – the total number of source records that have an empty value in a given column. 
+ +- **Completeness** – the percentage of non-empty or non-null values in a given column. + +- **Incompleteness** – the percentage of empty or null values in a given column. + +- **Cardinality** – the number of unique values in a given column. + +- **Redundancy** – how frequently unique values are repeated in a given column. + +By analyzing data set profiling metrics, you can quickly identify if a particular column contains empty fields. Once aware of this issue, you can fix it by providing the necessary values as described in [Modify source records](#modify-source-records). In general, a high percentage of incompleteness and a large number of repeated values may indicate that the field is not a good choice for producing [identifiers](/key-terms-and-features/entity-codes). + +### Profiling for golden records + +Profiling for golden records provides a breakdown and distribution of all vocabulary key values in a graphic format. It can help you identify issues and anomalies in data quality. Profiling is not limited to a data set, meaning that it displays information about vocabulary key values coming from different data sources. + +To access profiling for golden records, go to **Administration** > **Feature Flags**, and enable the **Profiling dashboards** feature. + +![profiling-feature-flag.png](../../assets/images/integration/additional-operations/profiling-feature-flag.png) **To view profiling** -- In the column header, open the menu, and select **View profiling**. +- Find the column for which you want to view golden record profiling. In the column header, open the three-dot menu, and then select **View profiling for golden records**. 
- ![preview-profiling-2.gif](../../assets/images/integration/additional-operations/preview-profiling-2.gif) + ![view-profiling-for-golden-records.png](../../assets/images/integration/additional-operations/view-profiling-for-golden-records.png) - For more information, see [Vocabulary keys](/management/data-catalog/vocabulary-keys). + The **Profiling** pane opens, where you can view the details about the vocabulary key values in a graphic format. -## Data set filters and operations +The main profiling details for golden records include the following: -Data set filters and operations allow you to view, search, analyze, and modify the data from files. These actions are only available when you switch to the edit mode of the data set. +- **Total values** – the total number of values that the vocabulary key has in the system. -Data set filters and operations is a beta feature. To access data set filters and operations, go to **Administration** > **Feature Flags**, and enable the **Data Set Filters & Operations** feature. +- **Values over time** – a time-series visualization displaying the count of values that appeared in CluedIn over time. The x axis represents the time, and the y axis represents the count of values. For a more granular view, you can select a specific time period using the mouse. -![preview-filters-and-operations-1.png](../../assets/images/integration/additional-operations/preview-filters-and-operations-1.png) +- **Distribution of values (bar gauge)** – the distribution of vocabulary key values based on the number of records where each value is used. Each bar represents a distinct value, with the color gradient indicating the frequency of that value’s occurrence in the records: green indicates a lower number of records, while red indicates a higher number of records. This gradient provides a quick visual cue for identifying the most and least common values. 
+ +- **Distribution of values (pie chart)** – the distribution of vocabulary key values based on the number of records where each value is used. Each slice of the pie represents a distinct key value, with the area of the slice indicating the frequency of that value’s occurrence in the records. The larger the slice, the higher the number of records that contain the value. This type of visualization helps to quickly understand the relative proportions of different vocabulary key values. -Data set filters and operations are available only in data sets created from files. They are not available in data sets created from endpoints or databases. +You can view profiling for golden records at any stage of the data ingestion process: immediately after uploading the data, after creating the mapping, or after processing the data. Profiling is type-specific, so if you change the data type during mapping and then process the data, the profiling dashboard may look different from when you first uploaded the data. For example, if you change the data type for the Revenue column from text to integer, the profiling dashboard will display metrics such as minimum value, maximum value, and average value instead of a breakdown and distribution of text values. You can also view profiling on the [vocabulary key page](/management/data-catalog/vocabulary-keys). -### Switch to edit mode +### Duplicates -Switching to the edit mode allows you to use filters, modify the data set, and perform various operations. +Once you have created the mapping for the data set, you can check if a specific column contains duplicate values. + +**To view duplicates** + +- Find the column for which you want to view duplicates. In the column header, open the three-dot menu, and then select **View duplicates**. 
+ + ![view-duplicates.png](../../assets/images/integration/additional-operations/view-duplicates.png) + + The **Duplicates preview** pane opens, where you can view the total number of duplicate values in the column (a), identify which values are duplicates (b), and see how many times each duplicate value occurs (c). + + ![duplicates-preview.png](../../assets/images/integration/additional-operations/duplicates-preview.png) + +## Modify source records + +If you identify data gaps, errors, and overall quality issues in your source records and want to fix them before processing, you can do so in the edit mode of the data set. The edit mode allows you to perform various operations and validations on the source records to address these issues. The edit mode is available in data sets created from files and endpoints. {:.important} When you switch to the edit mode, the original data set is cloned, so you can revert back to it at any time. However, reverting to the original data set means losing all changes made in the edit mode. **To switch to the edit mode** -- Near the upper-right corner of the data set, open the three-dot menu, and then select **Switch to edit mode**. Then, confirm that you want to switch to the edit mode. +1. In the upper-right corner of the data set page, select **Switch to edit mode**. + + ![switch-to-edit-mode.png](../../assets/images/integration/additional-operations/switch-to-edit-mode.png) - ![Switch_to_edit_mode.png](../../assets/images/integration/additional-operations/Switch_to_edit_mode.png) +1. Review the instructions about the edit mode, and then confirm that you want to switch to the edit mode. Depending on the size of your data set, switching to the edit mode might take some time. You'll receive a notification when the data set has been switched to the edit mode. -If you decide to go back to the original data set, exit the edit mode. 
To do that, near the upper-right corner of the data set, open the three-dot menu, and then select **Switch to original**. Then, confirm that you want to switch to the original data set. + If you switch to edit mode for a data set that has already been processed, keep in mind that editing and re-processing this data set might lead to changes in primary identifiers. For example, if the primary identifiers are auto-generated, re-processing the data set will produce new golden records. + +If you decide to go back to the original data set, exit the edit mode. To do that, in the upper-right corner of the data set page, select **Switch to original**. Then, confirm that you want to switch to the original data set. If you made any changes in the edit mode, they will not be available in the original data set. + +When your data set is in the edit mode, you can do the following actions: + +- [Filter records](#filter-records) + +- [Edit values manually](#edit-values-manually) + +- [Transform values using operations](#transform-values-using-operations) + +- [Add columns](#add-columns) + +- [Remove source records](#remove-source-records) ### Filter records -When the data set is in the edit mode, you can apply the following filters to any column: +Filters help you quickly find specific source records, identify records with empty values, and view column value aggregation. You can apply the following filters to any column: -- **Search** – select or enter a specific value to display all records containing that value. +- **Search** – select or enter a specific value to search for all records containing that value. You can choose to display the results that precisely match the entered value (**Precise match**) or those that exclude it (**Inverse**). - **Is empty** – display all records where the column contains empty values. 
@@ -71,121 +155,155 @@ When the data set is in the edit mode, you can apply the following filters to an - **Aggregation** – consolidate and summarize all values contained in the column. +- **Invalid fields** – display all records that contain invalid fields according to the column validation results. If no validation is applied to the column, this filtering option is unavailable. Learn more about validation in [Source record validation](/integration/additional-operations-on-records/source-record-validation). + +- **Valid fields** – display all records that contain valid fields according to the column validation results. If no validation is applied to the column, this filtering option is unavailable. Learn more about validation in [Source record validation](/integration/additional-operations-on-records/source-record-validation). + +You can use filters when you need to find incorrect or empty values. Once you find them, you can fix them by [editing the values manually](#edit-values-manually). + **To apply a filter** -- In the column header, open the three-dot menu, and then select **Filter**. Then, select the needed filtering option. +- Find the column you want to filter. In the column header, open the three-dot menu, select **Filter**, and then select the needed filtering option. - ![apply-filter.gif](../../assets/images/integration/additional-operations/apply-filter.gif) + ![filter-options.png](../../assets/images/integration/additional-operations/filter-options.png) - All applied filters are displayed on the **Filters** pane. If you don't need a filter temporarily, you can disable it and enable it again when needed. If you no longer need a filter, you can delete it by selecting the delete icon. + As a result, the records matching the filter are displayed on the page. All applied filters are listed on the **Filters** pane.
-### Perform operations + ![filters-pane.png](../../assets/images/integration/additional-operations/filters-pane.png) -When the data set is in the edit mode, you can perform various transformations on the values in any column. For example, you can transform the values to upper case, remove extra spaces, replace one value with another, and more. The purpose of operations is to help you transform and improve the contents of columns automatically and efficiently. + If you don't need a filter temporarily, you can disable it and enable it again when needed. If you no longer need a filter, you can delete it by selecting the delete icon. -The following table shows all available operations and their description. +### Edit values manually -| Operation | Description | -|--|--| -| Camel case | Transforms all values in the column to camel case (directorOfSales). | -| Capitalize | Changes the first letter of the first word to uppercase (Director of sales). | -| Decapitalize | Changes the first letter of the first word to lowercase (director of sales). | -| Auto-generated value | Replaces values with automatically created unique values. | -| Kebab case | Transforms all values in the column to kebab case (director-of-sales). | -| Lower case | Transforms all values in the column to lowercase (director of sales). | -| Set value | Replaces all values in the column with the specified value. | -| Slugify | Converts a string into a lowercase, hyphen-separated version of the string, with all special characters removed. | -| Swap case | Invert the case of each letter in a string. This means that all uppercase letters are converted to lowercase, and all lowercase letters are converted to uppercase. | -| Trim | Removes all leading and trailing whitespace from a string. | -| Trim Left | Removes only the leading whitespace from a string. | -| Trim Right | Removes only the trailing whitespace from a string. | -| Upper case | Converts all letters of each word to uppercase (DIRECTOR OF SALES). 
| -| Title case | Converts the first letter of each word to uppercase (Director Of Sales). | -| Keep only numbers | Extracts and retains only the numeric characters from a string, removing any non-numeric characters. | -| To Boolean | Converts a value to a Boolean data type, which can be either true or false. | -| Replace | Replaces the first occurrence of a specified pattern within a string with another character or sequence of characters. | -| Replace all | Replaces all occurrences of a specified pattern within a string with another character or sequence of characters. | -| Replace spaces | Replaces spaces in a string with another character or sequence of characters. | -| Replace character(s) with space | Replaces specified characters with a space. | +If you notice that your source records contain invalid values, missing values, or other issues and want to fix them before processing, you can do it by modifying the values manually. This way, you can fix incorrect spelling, provide missing values, or make any other changes. -**To perform an operation** +**To edit a value manually** -- In the column header, open the three-dot menu, and then select **Operations**. Then, select the needed operation. +1. Find a record containing a value that you want to edit. To quickly find the needed record, use [filters](#filter-records). - ![perform-operation.gif](../../assets/images/integration/additional-operations/perform-operation.gif) +1. Click on the cell containing a value that you want to edit and make the needed changes. The edited value is formatted in bold. - All applied operations are displayed on the **Operations** pane. If you no longer need the change or you made it by mistake, you can revert the change. To do it, select the **Undo the last operation** icon or the delete icon. Note that changes can only be reverted consecutively, one by one, and not selectively. 
+ ![edit-values-manually.png](../../assets/images/integration/additional-operations/edit-values-manually.png) -### Modify data manually +1. Select **Save**. The bold formatting of the changed value disappears. -When the data set is in the edit mode, you can manually modify the data directly in the cells. + The history of your manual changes is displayed on the **Operations** tab. -**To modify data manually** + ![edit-values-manually-operations.png](../../assets/images/integration/additional-operations/edit-values-manually-operations.png) -1. Click on the value that you want to change and make the needed changes. The edited value will be displayed in bold. + If you no longer need the change or you made it by mistake, you can revert the change. To do it, select the delete icon next to the operation name or the revert icon in the upper-right corner. Note that changes can only be reverted consecutively, one by one, and not selectively. -1. Select **Save**. The bold formatting of the changed value disappears. +### Transform values using operations + +You can transform and improve the contents of columns automatically and efficiently using operations. For example, you can transform the values to upper case, remove extra spaces, replace one value with another, and more. The following table shows all available operations and their description. + +| Operation | Description | +|--|--| +| Camel case | Transform all values in the column to camel case (directorOfSales). | +| Capitalize | Change the first letter of the first word to uppercase (Director of sales). | +| Decapitalize | Change the first letter of the first word to lowercase (director of sales). | +| Auto-generated value | Replace values with automatically created unique values. | +| Kebab case | Transform all values in the column to kebab case (director-of-sales). | +| Lower case | Transform all values in the column to lowercase (director of sales). 
| +| Set value | Replace all values in the column with the specified value. | +| Slugify | Convert a string into a lowercase, hyphen-separated version of the string, with all special characters removed. | +| Swap case | Invert the case of each letter in a string. This means that all uppercase letters are converted to lowercase, and all lowercase letters are converted to uppercase. | +| Trim | Remove all leading and trailing whitespace from a string. | +| Trim Left | Remove only the leading whitespace from a string. | +| Trim Right | Remove only the trailing whitespace from a string. | +| Upper case | Convert all letters of each word to uppercase (DIRECTOR OF SALES). | +| Title case | Convert the first letter of each word to uppercase (Director Of Sales). | +| Keep only numbers | Extract and retain only the numeric characters from a string, removing any non-numeric characters. | +| To Boolean | Convert a value to a Boolean data type, which can be either true or false. | +| Replace | Replace the first occurrence of a specified pattern within a string with another character or sequence of characters. | +| Replace all | Replace all occurrences of a specified pattern within a string with another character or sequence of characters. | +| Replace spaces | Replace spaces in a string with another character or sequence of characters. | +| Replace character(s) with space | Replace specified characters with a space. | - ![modify-data-manually.gif](../../assets/images/integration/additional-operations/modify-data-manually.gif) +**To transform values in a column using an operation** - The history of your manual changes is displayed on the **Operations** tab. If you no longer need the change or you made it by mistake, you can revert the change. To do it, select the **Undo the last operation** icon or the delete icon. Note that changes can only be reverted consecutively, one by one, and not selectively. +- Find the column containing values that you want to transform. 
In the column header, open the three-dot menu, select **Operations**, and then select the needed operation. + + ![operations-options.png](../../assets/images/integration/additional-operations/operations-options.png) + + As a result, the values in the column are automatically transformed. All applied operations are displayed on the **Operations** pane. If you no longer need the change or you made it by mistake, you can revert the change. To do it, select the delete icon next to the operation name or the revert icon in the upper-right corner. Note that changes can only be reverted consecutively, one by one, and not selectively. ### Add columns -When the data set is in the edit mode, you can add new columns to the data set. +You can add new columns to your data set. This is useful for combining values from other columns or creating an empty column for new values. There are two types of columns that you can add: -**To add a column to the data set** +- **Stored column** – the values are generated only at the column creation time. If new records appear in the data set afterwards, the stored column in such records will contain empty values. A stored column may be a good choice when working with a data set created from a file, which will not change over time. -1. Select **Add column**. +- **Computed column** – the values are combined from already existing columns into one new column. A computed column may be a good choice when working with a data set created from an endpoint, which will receive new records over time. -1. Enter the column name. +**To add a column to the data set** -1. Select the column type: +1. Select **Add column**. - - Stored column – generate the values for the column: empty values that can modified later, values based on other existing fields in the data set, or values based on the low-code approach expression. +1. Enter the column name and select the column type.
- - Computed column – combine the values from two already existing columns into one new column. + ![add-column-type.png](../../assets/images/integration/additional-operations/add-column-type.png) 1. Select **Next**. -1. Choose an option for generating the values for the column fields: +1. Choose an option for generating the values for the column: + + - (Stored column) **Empty** – a new column with empty fields will be added to the data set. - - (Stored column) Empty – a new column with empty fields will be added to the data set. - - (Stored column or computed column) From existing fields – select the fields from the data set that you want to combine to create values for a new column. You can add multiple fields. By default, the values are separated with a space, but you can enter another delimiter if needed. + - (Stored column or computed column) **From existing fields** – select the fields from the data set that you want to combine to create values for a new column. You can add multiple fields. By default, the values are separated with a space, but you can enter another delimiter if needed. - - (Stored column or computed column) Expression – enter a C.E.L supported expression to create values for a new column. + - (Stored column or computed column) **Expression** – enter a CEL-supported expression to create values for a new column. + ![add-column-configuration.png](../../assets/images/integration/additional-operations/add-column-configuration.png) 1. Select **Save**. - ![add-column.gif](../../assets/images/integration/additional-operations/add-column.gif) + The new column is added to the data set and is marked with the information icon. If you no longer need the column, you can delete it. To delete the computed column, open the three-dot menu in the column header, select **Delete computed field**, and then confirm your choice.
To delete the stored column, open the **Operations** pane, and then select the delete icon for the corresponding operation or the revert icon in the upper-right corner. Note that operations can only be reverted consecutively, one by one, and not selectively. - The new column is added to the data set and is marked with the information icon. If you no longer need the column, you can delete it. To do it, in the column header, open the three-dot menu, select **Delete computed field**, and then confirm your choice. +### Remove source records -## View duplicates +If you do not need specific source records, you can remove them from the data set. -This feature allows you to check if the column contains duplicate values. Viewing duplicates in a column is available only after the mapping for the data set has been created. +**To remove source records** -**To view duplicates** + 1. Select the checkbox next to the source record that you want to remove. + + ![remove-source-records.png](../../assets/images/integration/additional-operations/remove-source-records.png) + +1. Select the delete icon. + +1. In the confirmation dialog, enter _DELETE_, and then select **Confirm**. + + As a result, the records are removed from the data set. This change is displayed on the **Operations** tab. If you removed the records by mistake, you can revert the change. To do it, select the delete icon for the corresponding operation or the revert icon in the upper-right corner. Note that changes can only be reverted consecutively, one by one, and not selectively. + +## Additional actions + +Apart from analyzing and modifying source records, you can perform additional actions on the **Preview** tab. These actions depend on the data source type. + +### File: Download original file + +If you are working with a data set created from a file, you can download the original file. This is useful if you have modified the source records in the edit mode and want to keep the original file for reference. 
+ +**To download the original file** -1. Open the menu in the column heading, and then select **View duplicates**. +1. Near the sorting dropdown list, open the three-dot menu, and then select **Download original file**. - ![preview-duplicates-1.gif](../../assets/images/integration/additional-operations/preview-duplicates-1.gif) + ![download-original-file.png](../../assets/images/integration/additional-operations/download-original-file.png) - You can view the total number of duplicate values in the column, identify which values are duplicates, and see how many times each duplicate value occurs. + As a result, the original file is downloaded to your computer. -## Clear records +### Endpoint: Clear records -This feature is available only for data sets created using an ingestion endpoint. It allows you to delete records from the **Preview** tab. This is useful when you send many requests to the same ingestion endpoint and want to avoid processing each record every time. When processing records, CluedIn checks if they have been processed before. If they have, they won't be processed again. To reduce processing time, you can delete already processed records from the **Preview** tab. +If you are working with a data set created from an endpoint, you can delete the records from temporary storage on the **Preview** tab. This is useful if you send many requests to the same endpoint and want to avoid processing each record every time. When processing records, CluedIn checks if they have been processed before. If they have, they won't be processed again. To reduce processing time, you can delete already processed records from the **Preview** tab. You can clear the records regardless of whether the records have been processed or not, but if you haven't processed the records, they will be permanently deleted. -**To clear the content** +**To clear records** -1. Near the upper-right corner of the table, select the vertical ellipsis button, and then select **Clear records**. +1. 
Near the sorting dropdown list, open the three-dot menu, and then select **Clear records**. -1. Confirm that you want to delete the records by entering _DELETE_. Then, select **Confirm**. + ![clear-records.png](../../assets/images/integration/additional-operations/clear-records.png) - ![clear-content.gif](../../assets/images/integration/additional-operations/clear-content.gif) +1. In the confirmation dialog, enter _DELETE_, and then select **Confirm**. - After records are deleted, you can send more data to the ingestion endpoint. \ No newline at end of file + As a result, all records from the **Preview** tab are deleted, and you can send more data to the endpoint. \ No newline at end of file diff --git a/docs/040-integration/additional-operations-on-records/090-validations.md b/docs/040-integration/additional-operations-on-records/090-validations.md new file mode 100644 index 00000000..5960f1f3 --- /dev/null +++ b/docs/040-integration/additional-operations-on-records/090-validations.md @@ -0,0 +1,232 @@ +--- +layout: cluedin +nav_order: 9 +parent: Additional operations +grand_parent: Integration +permalink: /integration/additional-operations-on-records/validations +title: Validations +last_modified: 2025-04-03 +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will learn how to check the source records for errors, inconsistencies, and missing values, as well as how to fix invalid values with the help of validations. As a result, you can improve the quality of source records and prevent incorrect records from becoming golden records or aggregating to the existing golden records. + +
+ +
+ +## Perform validations + +To perform source record validations, you need to meet the following prerequisites: + +- The data set should be **mapped** to standard fields. This way, CluedIn can analyze your data and suggest appropriate validation methods. Learn more in [Create mapping](/integration/create-mapping) and [Review mapping](/integration/review-mapping). + +- The data set should be in the **edit mode**. This way, you can add validations and modify source records. Learn more in [Analysis and modification](/integration/additional-operations-on-records/preview). + +**To access validations** + +- On the data set page, select **Validations**. + + ![access-validations.png](../../assets/images/integration/additional-operations/access-validations.png) + + The **Validations** pane opens on the right side of the page. + +When you access validations for the first time or when you reset validations, you will see the initial validation setup options: + +- **Auto-validation** – CluedIn will analyze the fields and suggest appropriate validation methods for some fields. + +- **Manual setup** – you need to select appropriate validation methods for the fields that you want to validate. + +You can start with auto-validation and then add manual validations for the needed fields. + +### Auto-validation + +Auto-validation is a good starting point for finding invalid values. + +**To run auto-validation** + +1. On the **Validations** pane, in the **Initial Validation Setup** section, make sure that the **Auto-validation** option is selected. + + ![validations-pane.png](../../assets/images/integration/additional-operations/validations-pane.png) + +1. In the **Validation Preview** section, review the fields along with validation methods suggested by CluedIn. + +1. Select **Validate**. + + CluedIn will run validations on the specific fields. When the validations are complete, you will see the [validation results](#review-validation-results) for each field.
+ + ![validations-result.png](../../assets/images/integration/additional-operations/validations-result.png) + + The fields that failed the validation check are highlighted in red. Now, you can process the results of auto-validation and [fix invalid values](#fix-invalid-values). + +### Manual validation + +If auto-validation is not sufficient for you or if you want to apply different validation methods, use manual validation. + +**To add manual validation** + +1. Depending on whether you have already run auto-validation, do one of the following: + + - If you have already run auto-validation, then expand the filter dropdown, and select **Show all fields**. + + ![validations-filter.png](../../assets/images/integration/additional-operations/validations-filter.png) + + - If you have not run auto-validation, then in the **Initial Validation Setup** section, select the **Manual Setup** option, and then select **Validate**. + +1. From the list of all fields, find the field for which you want to add manual validation. To quickly find the needed field, start entering its name in the search field. + + ![validations-search.png](../../assets/images/integration/additional-operations/validations-search.png) + +1. In the **Validation Method** dropdown, select the [validation method](#validation-methods) that you want to apply to the field. Depending on the selected validation method, you might need to provide additional configuration details. + + For example, the following screenshot shows the validation method that checks the Job Title field for empty values. The **Inverse** toggle is turned on, indicating that if the field contains an empty string, it will be marked as invalid. + + ![validations-manual-setup.png](../../assets/images/integration/additional-operations/validations-manual-setup.png) + +1. When the validation for a field is configured, select **Validate**. + + CluedIn will run validation on the specific field. 
When the validation is complete, you will see the [validation results](#review-validation-results) for the field. + +1. To add manual validations for other fields, repeat steps 2–4. + + The fields that failed the validation check are highlighted in red. Now, you can process the results of manual validation and [fix invalid values](#fix-invalid-values). + +### Advanced validation + +The previous validation options—auto-validation and manual validation—focus on field-level validation. If you need to implement complex business logic to check for invalid records, use advanced validation. + +**To add advanced validation** + +1. On the **Validations** tab, select **Advanced Validation**. + + ![validations-advanced-validations.png](../../assets/images/integration/additional-operations/validations-advanced-validations.png) + +1. Select **Run** to load all clues that were created from the data set. The clues appear on the right side of the page. + +1. On the left side of the page, write the code to create source record validation logic. You can write any JavaScript code. + + ![source-record-validations-advanced-code.png](../../assets/images/integration/additional-operations/source-record-validations-advanced-code.png) + + In this example, the code checks the Job Title and Department values; if the Job Title is Accountant and the Department is Marketing, the whole record is marked as invalid. + +1. To check if the code is applied as intended, select **Run**. As a result, if the record contains the specified combination of values, the **isValid** property is set to **false**. + + ![source-record-validations-advanced-result.png](../../assets/images/integration/additional-operations/source-record-validations-advanced-result.png) + +1. If you are satisfied with the result, select **Save**. To return to the data set, select **Back**. + + The source records that failed the advanced validation check are highlighted in red.
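As a rough illustration, the advanced validation logic described in this section could be sketched as follows. The flat record shape used here is a hypothetical simplification; the actual clue object that CluedIn passes to your script may be structured differently.

```javascript
// Hedged sketch of the advanced validation example above (hypothetical record
// shape; not CluedIn's actual clue object).
function validateRecord(record) {
  // Mark the whole record as invalid when Job Title is "Accountant"
  // and Department is "Marketing".
  const isValid = !(
    record["Job Title"] === "Accountant" &&
    record["Department"] === "Marketing"
  );
  return { ...record, isValid };
}

// This record fails the check, so isValid is set to false.
const checked = validateRecord({ "Job Title": "Accountant", "Department": "Marketing" });
console.log(checked.isValid); // false
```

Any similar predicate over the record's field values can be used; the key point is that the script decides the value of the `isValid` property for each record.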
Now, you can [fix invalid values](#fix-invalid-values). + +## Manage validations + +Once you have added validations, there are a number of actions you can take to manage these validations: + +1. **Refresh** – re-run the validation check for a field. + +1. **Remove** – remove the validation check for a field. + +1. **Edit** – edit the configuration of the validation for a field. You can select another [validation method](#validation-methods) and modify the configuration details. After you finish, select **Save & Validate**. As a result, the validation check for the field is run again. + + ![validations-refresh-delete-edit.png](../../assets/images/integration/additional-operations/validations-refresh-delete-edit.png) + +1. **Filter fields** – select between all fields, validated fields, or non-validated fields. + +1. **Sort fields** – sort the fields displayed on the pane: by name, by newest, by oldest. + +1. **Reset field filters** – reset the filters to show all fields. + + ![validations-filters-sorting-reset.png](../../assets/images/integration/additional-operations/validations-filters-sorting-reset.png) + +1. **Reset validation methods** – remove all validation methods so that you can start adding them from scratch. + + ![validations-reset-methods.png](../../assets/images/integration/additional-operations/validations-reset-methods.png) + +## Process validation results + +After performing validations, you can start reviewing validation results and fixing invalid values. + +### Review validation results + +You can review validation results in two places: on the **Validations** pane and on the data set page. + +**Validation results on the Validations pane** + +On the **Validations** pane, you can view validation results for each field. These results include the validation method as well as the total number of values, the number of valid values, and the number of invalid values.
You can select the number of invalid values for a field to filter the records displayed on the page. Additionally, the status bar shows the percentage of valid values for a field. + +![validations-result.png](../../assets/images/integration/additional-operations/validations-result.png) + +**Validation results on the data set page** + +On the data set page, you can view validation results for each column. Hover over the status bar at the bottom of the column header and you will see the validation results for that column (a). These results are the same as on the **Validations** pane. Additionally, the status bar under the column header shows the ratio of valid (green) to invalid (red) values in the column. You can also view the general statistics of valid values in the data set (b). + +![validations-column-hover.png](../../assets/images/integration/additional-operations/validations-column-hover.png) + +### Fix invalid values + +After running the validation checks, you can start reviewing and fixing invalid values. You are not required to fix invalid values. When you process the data set, the records containing invalid values are processed in the same way as all the other records; they will not be automatically sent to [quarantine](/integration/additional-operations-on-records/quarantine) or to the approval area unless specific rules are configured. + +**To find and fix invalid values** + +1. To find invalid values, use one of the following options: + + - In the **Validations** pane, locate the field for which you want to view invalid values, and then select the number of invalid values. + + - On the data set page, locate the column for which you want to view invalid values. Then, open the three-dot menu in the column header, and select **Filter** > **Invalid Fields**. + + Regardless of the option that you use, the filter for invalid values is added to the **Filters** pane. As a result, the data set page displays the records containing invalid values.
These invalid values are highlighted in red. + + ![validations-invalid-values.png](../../assets/images/integration/additional-operations/validations-invalid-values.png) + +1. To fix invalid values, select the cell containing the invalid value and modify it accordingly. The modified value is marked in bold. Once the value is correct, it becomes highlighted in green. + + ![validations-invalid-values-fixed.png](../../assets/images/integration/additional-operations/validations-invalid-values-fixed.png) + +1. Select **Save**. The records containing fixed values disappear from the data set page because they no longer meet the filter criteria for showing invalid values. + + The modifications that you make to the source records are added to the **Operations** pane every time you save changes. + + ![validations-operations.png](../../assets/images/integration/additional-operations/validations-operations.png) + + If you want to revert changes, select the delete icon next to the operation name or the revert icon in the upper-right corner. Note that changes can only be reverted consecutively, one by one, and not selectively. + +## Validation methods + +| Method | Description | +|--|--| +| Range (<%=min%> to <%=max%>) | Check if a value falls within a specified range. You need to provide the min and max values. | +| Is valid email address | Check if an email address is correctly formatted and potentially valid. This validation method ensures the email address follows standard formatting rules, such as having an “@” symbol and a valid domain name (e.g., example@domain.com). | +| Is empty string | Check if a value is an empty string, that is, contains no characters. | +| Is equal to | Check if a value is equal to a specific value. | +| Is equal to Multiple (AND) | Check if a value meets all of the specified values using the logical AND operator. | +| Is equal to Multiple (OR) | Check if a value meets at least one of multiple specified values using the logical OR operator. 
| +| Is a number | Check if a value is a valid number. | +| Regex - matches/<%=pattern%> | Check if a value matches a regex pattern. You need to provide the regex pattern. | +| Is URL | Check if a value is a valid URL (uniform resource locator). | +| Ip Address V4 | Check if a value is a valid IPv4 address. An IPv4 address consists of four numerical segments separated by dots, with each segment ranging from 0 to 255 (for example, `192.168.1.1`). | +| Ip Address V6 | Check if a value is a valid IPv6 address. An IPv6 address consists of eight groups of four hexadecimal digits, separated by colons (for example, `2001:0db8:85a3:0000:0000:8a2e:0370:7334`). | +| Is UUID | Check if a value is a valid UUID (universally unique identifier). | +| Is Credit Card | Check if a value is a valid credit card number. | +| Is Boolean | Check if a value is a valid Boolean (either `true` or `false`). | +| Is Currency | Check if a value is a valid ISO 4217 currency code. | +| Is ISO31661 Alpha2 | Check if a value is a valid ISO 3166-1 alpha-2 country code. ISO 3166-1 alpha-2 codes are two-letter codes used to represent countries, dependent territories, and special areas of geographical interest. | +| Is ISO31661 Alpha3 | Check if a value is a valid ISO 3166-1 alpha-3 country code. ISO 3166-1 alpha-3 codes are three-letter codes used to represent countries, dependent territories, and special areas of geographical interest. | +| Is gender Facebook | Check if a gender value falls within the list of gender options available on Facebook. | +| Is Bank Card Type | Check if a value corresponds to a recognized bank card type, such as American Express, Bankcard, China UnionPay, Diners Club Carte Blanche, Diners Club enRoute, Diners Club International, Diners Club United States & Canada, InstaPayment, JCB, Laser, Maestro, Mastercard, Solo, Switch, Visa, Visa Electron. 
| +| Is gender Abbreviation | Check if a given value is a valid gender abbreviation, such as "M" for Male, "F" for Female, and "X" for non-binary or other gender identities. | +| Is Integer | Check if a value is a valid integer. | +| Is Age | Check if a value is a valid age between 0 and 199. | + +Regardless of the validation method, the validation for a field contains two additional settings: + +- **Required** – this setting defines whether a field is required to have a value. It is particularly useful when you are checking for empty fields; in that case, make sure you select the checkbox. If you are using other validation methods, it does not matter whether the checkbox is selected, because the validation is performed based on the existing values only, and empty fields are not marked as invalid. + +- **Inverse** – this setting defines whether a field is marked valid or invalid based on the validation method. + + When the toggle is turned off, a value is marked as valid if it meets the condition expressed in the validation method, and invalid if it does not. + + When the toggle is turned on, this logic is reversed: a value is marked as invalid if it meets the condition expressed in the validation method, and valid if it does not. + + For example, if you want to mark all empty fields as invalid, use the **Is empty string** validation method, select the **Required** checkbox, and turn on the **Inverse** toggle. 
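The combined effect of a validation method, the **Required** checkbox, and the **Inverse** toggle can be sketched in a few lines of Python. This is an illustrative sketch only, not CluedIn's implementation; the `is_valid` function and its parameters are hypothetical.

```python
def is_valid(value, check, required=False, inverse=False):
    """Illustrative sketch of how one field validation could combine
    the Required and Inverse settings (not CluedIn's actual code)."""
    if value is None or value == "":
        # Empty values fail only when Required is selected;
        # otherwise they are skipped and counted as valid.
        return not required
    result = check(value)
    # Inverse flips the outcome of the validation method.
    return (not result) if inverse else result

# Example from the text: mark all empty fields as invalid by using
# "Is empty string" with Required selected and Inverse turned on.
is_empty_string = lambda v: v == ""
print(is_valid("", is_empty_string, required=True, inverse=True))       # False (invalid)
print(is_valid("Alice", is_empty_string, required=True, inverse=True))  # True (valid)
```

With **Required** cleared and **Inverse** off, the same sketch leaves empty values valid, matching the behavior described for the other validation methods.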
\ No newline at end of file diff --git a/docs/040-integration/additional-operations-on-records/070-remove-records.md b/docs/040-integration/additional-operations-on-records/100-remove-records.md similarity index 99% rename from docs/040-integration/additional-operations-on-records/070-remove-records.md rename to docs/040-integration/additional-operations-on-records/100-remove-records.md index 67499a28..dfbeefe2 100644 --- a/docs/040-integration/additional-operations-on-records/070-remove-records.md +++ b/docs/040-integration/additional-operations-on-records/100-remove-records.md @@ -1,10 +1,10 @@ --- layout: cluedin -nav_order: 7 +nav_order: 10 parent: Additional operations grand_parent: Integration permalink: /integration/additional-operations-on-records/remove-records -title: Remove records +title: Removal of records last_modified: 2024-08-26 --- diff --git a/docs/040-integration/crawlers/080-build-integration.md b/docs/040-integration/crawlers/080-build-integration.md index 62b1c73e..7775b5a1 100644 --- a/docs/040-integration/crawlers/080-build-integration.md +++ b/docs/040-integration/crawlers/080-build-integration.md @@ -90,7 +90,7 @@ The following is the minimal steps required to replicate the _Hello World_ examp __'.___.'__ ´ ` |° ´ Y ` ? What is the model name? User - ? What is the entity type? Person + ? What is the business domain? Person ? Enter a comma separated list of properties to add to the model id,name,username,email ? Choose the visibility for key: id(undefined) Visible ? Choose the type for key id Integer diff --git a/docs/040-integration/data-sources/150-endpoint.md b/docs/040-integration/data-sources/150-endpoint.md index 2bf43c44..99d78dc5 100644 --- a/docs/040-integration/data-sources/150-endpoint.md +++ b/docs/040-integration/data-sources/150-endpoint.md @@ -83,7 +83,7 @@ An ingestion endpoint is a channel through which CluedIn can receive data from e 1. 
Select the **Mapping configuration** option: - - **New mapping** – you can create a new mapping for the data set. If you choose this option, you need to select the existing entity type or create a new one. If you create a new entity type, select an icon to visually represent the entity type. + - **New mapping** – you can create a new mapping for the data set. If you choose this option, you need to select the existing business domain or create a new one. If you create a new business domain, select an icon to visually represent the business domain. - **Existing mapping** – you can reuse the mapping from the data set that has the same structure. If you choose this option, you need to indicate the data set with the required mapping configuration. To do that, choose the following items one by one: a data source group, a data source, and a data set. diff --git a/docs/040-integration/data-sources/170-create-mapping.md b/docs/040-integration/data-sources/170-create-mapping.md index 6ee8c723..02080379 100644 --- a/docs/040-integration/data-sources/170-create-mapping.md +++ b/docs/040-integration/data-sources/170-create-mapping.md @@ -90,7 +90,7 @@ Manual mapping gives you full control over how each field for your data set will **To configure manual mapping** -1. Choose the existing entity type or create a new one. If you create a new entity type, select an icon to visually represent the entity type. +1. Choose the existing business domain or create a new one. If you create a new business domain, select an icon to visually represent the business domain. 1. Choose the existing vocabulary or create a new one. @@ -138,7 +138,7 @@ Auto mapping tries to detect unique codes and map original columns to the most a **To configure auto mapping** -1. Choose the existing entity type or create a new one. If you create a new entity type, select an icon to visually represent the entity type. +1. Choose the existing business domain or create a new one. 
If you create a new business domain, select an icon to visually represent the business domain. 1. Choose the existing vocabulary or create a new one. @@ -171,7 +171,7 @@ To use AI capabilities to create mapping, first complete all the steps described AI mapping analyzes your data set and suggests the following details for your mapping: -- Entity type and vocabulary. +- Business domain and vocabulary. - Origin – field used to produce the primary unique identifier. diff --git a/docs/040-integration/data-sources/180-review-mapping.md b/docs/040-integration/data-sources/180-review-mapping.md index 06866313..61a60037 100644 --- a/docs/040-integration/data-sources/180-review-mapping.md +++ b/docs/040-integration/data-sources/180-review-mapping.md @@ -17,13 +17,14 @@ After the mapping is created, review the mapping details to make sure that your ![review-mapping-1.png](../../assets/images/integration/data-sources/review-mapping-1.png) -**Important!** Before the data is processed, your mapping changes won't affect the existing records in CluedIn. +{:.important} +Before the data is processed, your mapping changes won't affect the existing records in CluedIn. To open the mapping details, on the **Map** tab of the data set, select **Edit mapping**. You'll see three tabs containing all mapping details: - [Map columns to vocabulary key](#properties) – here you can check which properties will be sent to CluedIn after processing. -- [Map entity](#codes) – here you can check the general details of the records that will be created after processing and the identifiers that will uniquely represent the records. +- [Map entity](#identifiers) – here you can check the general details of the records that will be created after processing and the identifiers that will uniquely represent the records. - [Add edge relations](#relationships) – here you can create rules for establishing the relationships between golden records. 
@@ -37,58 +38,58 @@ On the **Map columns to vocabulary key** tab, check how the original fields will - Add [property rules](/integration/additional-operations-on-records/property-rules) to improve the quality of mapped records by normalizing and transforming property values. -### Codes +### Identifiers -On the **Map entity** tab, review the general mapping details and check the identifiers that will uniquely represent the records in CluedIn—**entity origin code** and **entity codes**. +On the **Map entity** tab, review the general mapping details and check the identifiers that will uniquely represent the records in CluedIn—**primary identifier** and **identifiers**. **What are general details?** -- Entity type and vocabulary. +- Business domain and vocabulary. - Entity name – name of the records that is displayed on the search results page and on the record details page. - Preview image, description, date created, and date modified – record properties that you can find on the search results page and on the record details page. You can select which column should be used for each of these settings. -**What is an entity origin code?** +**What is a primary identifier?** -An entity origin code is a primary unique identifier of the record in CluedIn. If the entity origin codes are identical, the records will be merged. This merging is faster than creating a deduplication project because it is done on the fly and is based on strict equality matching. The deduplication project, on the other hand, is based on fuzzy matching and requires you to define matching criteria, making it a more time-consuming process. Even if you prefer to merge records by running a deduplication project, merging by entity origin codes produces cleaner, "pre-merged" records. As a result, the deduplication project will generate better results and be more performant. +A primary identifier is a unique identifier of the record in CluedIn. 
If the primary identifiers are identical, the records will be merged. This merging is faster than creating a deduplication project because it is done on the fly and is based on strict equality matching. The deduplication project, on the other hand, is based on fuzzy matching and requires you to define matching criteria, making it a more time-consuming process. Even if you prefer to merge records by running a deduplication project, merging by primary identifiers produces cleaner, "pre-merged" records. As a result, the deduplication project will generate better results and be more performant. -**Options for generating the entity origin code** +**Options for generating the primary identifier** -Depending on how unique you consider the records to be, you can choose one of the following options for generating the entity origin code: +Depending on how unique you consider the records to be, you can choose one of the following options for generating the primary identifier: -- **Single key** – CluedIn will generate unique entity origin codes for the records based on the selected property. This is the most commonly used option because data often already contains unique identifiers (for example, GUIDs) from the source systems. +- **Single key** – CluedIn will generate unique primary identifiers for the records based on the selected property. This is the most commonly used option because data often already contains unique identifiers (for example, GUIDs) from the source systems. -- **Auto-generated key** – CluedIn will generate unique entity origin codes for the records. Choosing this option may lead to an increased number of duplicates in the system, but you can mitigate this by running a deduplication project afterwards. +- **Auto-generated key** – CluedIn will generate unique primary identifiers for the records. Choosing this option may lead to an increased number of duplicates in the system, but you can mitigate this by running a deduplication project afterwards. 
-- **Compound key** – CluedIn will generate unique entity origin codes for the records by combining selected properties. Choose this option if you are confident that the data structure won't change in the future. For example, you can select the following properties to generate a compound key: First Name, Last Name, City, Address Line 1, and Country. +- **Compound key** – CluedIn will generate unique primary identifiers for the records by combining selected properties. Choose this option if you are confident that the data structure won't change in the future. For example, you can select the following properties to generate a compound key: First Name, Last Name, City, Address Line 1, and Country. -The following diagram will help you in determining which option to use for generating the entity origin code. +The following diagram will help you determine which option to use for generating the primary identifier. ![review-mapping-2.png](../../assets/images/integration/data-sources/review-mapping-2.png) **Example** -We ingested a file with 1,000 records personal data containing the following columns: ID, First Name, Last Name, Email, SSN, and Country Code. To create the mapping, we selected the Auto Mapping type, and CluedIn automatically generated the mapping for the data set. Since our data set included the 'ID' column, it was automatically selected as the entity origin code. This is a favorable option because no empty or duplicate values were found during the current data set check. It means that the ID is a reliable value to uniquely represent the record in CluedIn. +We ingested a file with 1,000 records of personal data containing the following columns: ID, First Name, Last Name, Email, SSN, and Country Code. To create the mapping, we selected the Auto Mapping type, and CluedIn automatically generated the mapping for the data set. Since our data set included the 'ID' column, it was automatically selected as the primary identifier. 
This is a favorable option because no empty or duplicate values were found during the current data set check. It means that the ID is a reliable value to uniquely represent the record in CluedIn. ![review-mapping-3.png](../../assets/images/integration/data-sources/review-mapping-3.png) -If we select a column that contains duplicate values (for example, country), the status check will immediately inform us of the number of duplicate values in the data set. By selecting **View more details**, you can view the number of duplicate values in the data set, which values are duplicates, and the number of times the duplicate value occurs in the data set. Referring to the screenshot below, there are 3 duplicate values in the data set: United States, Canada, and Spain. The value United States occurs in 550 records. If we proceed with this as the entity origin code and process the data, all 550 records will be merged into a single golden record. In this case, the country cannot serve as a unique representation for each record, as it is acceptable for records to share the same country. +If we select a column that contains duplicate values (for example, country), the status check will immediately inform us of the number of duplicate values in the data set. By selecting **View more details**, you can view the number of duplicate values in the data set, which values are duplicates, and the number of times the duplicate value occurs in the data set. Referring to the screenshot below, there are 3 duplicate values in the data set: United States, Canada, and Spain. The value United States occurs in 550 records. If we proceed with this as the primary identifier and process the data, all 550 records will be merged into a single golden record. In this case, the country cannot serve as a unique representation for each record, as it is acceptable for records to share the same country. 
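The merging behavior in this example can be sketched with a small grouping function. This is a hypothetical illustration of the principle (records with identical primary identifier values collapse into one golden record), not CluedIn's internal logic.

```python
from collections import defaultdict

def group_by_primary_identifier(records, key):
    """Group source records by the column chosen as the primary
    identifier; each group becomes one golden record (sketch only)."""
    golden_records = defaultdict(list)
    for record in records:
        golden_records[record[key]].append(record)
    return golden_records

records = [
    {"ID": "1", "Name": "Ann", "Country": "United States"},
    {"ID": "2", "Name": "Bob", "Country": "United States"},
    {"ID": "3", "Name": "Eva", "Country": "Canada"},
]

# Unique ID column: one golden record per source record.
print(len(group_by_primary_identifier(records, "ID")))       # 3
# Country column: both "United States" records merge into one.
print(len(group_by_primary_identifier(records, "Country")))  # 2
```

Scaled up to the example above, grouping 550 records by a shared country value would leave a single golden record, which is why a non-unique column makes a poor primary identifier.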
![review-mapping-4.png](../../assets/images/integration/data-sources/review-mapping-4.png) -However, if you are confident that the selected property can uniquely represent the record, you can proceed with processing the data. Records with identical entity origin codes will be automatically merged, eliminating the need for a separate deduplication project. +However, if you are confident that the selected property can uniquely represent the record, you can proceed with processing the data. Records with identical primary identifiers will be automatically merged, eliminating the need for a separate deduplication project. {:.important} -If the entity origin code contains an empty value, it is replaced with a hash. If your data set contains a record that is nearly identical to the one with the hash value, they will not be merged because the hash is a unique value. In such cases, you can initiate a deduplication project as a solution. +If the primary identifier contains an empty value, it is replaced with a hash. If your data set contains a record that is nearly identical to the one with the hash value, they will not be merged because the hash is a unique value. In such cases, you can initiate a deduplication project as a solution. -**What is an entity code?** +**What are identifiers?** -An entity code is an additional identifier that uniquely represents the record in CluedIn. If two entity codes are identical, the records will be merged. CluedIn automatically identifies properties that can be used as additional identifiers. For example, if the primary identifier (entity origin code) is the ID, then the additional identifier (entity code) could be the email. +Identifiers can uniquely represent the record in CluedIn, in addition to the primary identifier. If two identifiers are identical, the records will be merged. CluedIn automatically detects properties that can be used as additional identifiers. 
For example, if the primary identifier is the ID, then the additional identifier could be the email. -Even if there are no duplicate values according to the entity origin code, but there are some according to the entity code, the records will be merged. +Even if there are no duplicate values according to the primary identifier, but there are some according to additional identifiers, the records will be merged. -The following diagram will help you in determining if you need to add entity codes. +The following diagram will help you in determining if you need to add additional identifiers. ![review-mapping-5.png](../../assets/images/integration/data-sources/review-mapping-5.png) @@ -102,7 +103,7 @@ You can add a relationship before or after you process the records. If you add a When you start creating a relationship, you have to select a **property** from the current data set that references another property existing in CluedIn. For instance, in the case of SQL tables, this property could be a foreign key that connects two tables. The relationship will be established based on the selected property. Then, you need to choose the **edge mode**: -- **Edge** – CluedIn creates relationships between the records based on the origin code. +- **Edge** – CluedIn creates relationships between the records based on the origin. - **Strict Edge** – CluedIn creates relationships between the records that belong to a specific data set, data source, or data source group. @@ -110,12 +111,6 @@ When you start creating a relationship, you have to select a **property** from t After you select the edge mode, you need to choose the **edge type** to define the nature of relationships between records (for example, /WorksFor, /RequestedBy,/LocatedIn). -**Where can you find the origin code?** - -An origin code is a label or a keyword that briefly describes the origin of the record. To find the origin code, open any record and select **View Codes**. 
On the **Entity Codes** pane, you can find all codes associated with the record. In the example below, the code consists of entity type (/TrainingCompany), origin code (File Data Source), and primary unique identifier (1). - -![review-mapping-6.png](../../assets/images/integration/data-sources/review-mapping-6.png) - **Example** We have 2 data sets: diff --git a/docs/040-integration/data-sources/190-process-data.md b/docs/040-integration/data-sources/190-process-data.md index 82146ca6..f140cc0c 100644 --- a/docs/040-integration/data-sources/190-process-data.md +++ b/docs/040-integration/data-sources/190-process-data.md @@ -23,7 +23,7 @@ Depending on the type of data source, there are three processing options: - For ingestion endpoint only: [Bridge mode](#bridge-mode) -You can process the data set as many times as you want. In CluedIn, once a record has been processed, it won’t undergo processing again. When the processing is started, CluedIn checks for identical records. If identical records are found, they won’t be processed again. However, if you change the origin code for the previously processed records, CluedIn will treat these records as new and process them. +You can process the data set as many times as you want. In CluedIn, once a record has been processed, it won’t undergo processing again. When the processing is started, CluedIn checks for identical records. If identical records are found, they won’t be processed again. However, if you change the primary identifier for the previously processed records, CluedIn will treat these records as new and process them. After the processing is completed, the [processing log](#processing-logs) appears in the table. Any records that fail to meet specific conditions outlined in [property](/integration/additional-operations-on-records/property-rules) or [pre-process](/integration/additional-operations-on-records/preprocess-rules) rules will be sent to quarantine. 
To learn more about managing these records, see [Quarantine](/integration/additional-operations-on-records/quarantine). Records that were processed successfully are displayed on the **Data** tab. @@ -45,12 +45,12 @@ Manual processing is available for the data coming from a file, an ingestion end - Delete the records that are currently in quarantine. This option is useful if you have already processed the data set before and there are some records in quarantine. - - View the result of the origin entity code status check along with the field that was selected for producing the entity origin code. + - View the result of the primary identifier status check along with the field that was selected for producing the primary identifier. - - View the result of the code status check along with the field that was selected for producing the entity code. + - View the result of the identifier status check along with the field that was selected for producing an additional identifier. {:.important} - If any status check shows duplicates, the records containing duplicates will be merged to maintain data integrity and consistency. To learn more about unique identifiers, see [Codes](/integration/review-mapping#codes). + If any status check shows duplicates, the records containing duplicates will be merged to maintain data integrity and consistency. To learn more about unique identifiers, see [Identifiers](/integration/review-mapping#identifiers). 1. In the lower-right corner, select **Confirm**. 
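The processing behavior described above (identical records are skipped, while a changed primary identifier makes a record look new) can be sketched as follows. This is an illustrative model, not CluedIn's implementation; the `process_batch` helper and its fingerprinting scheme are assumptions.

```python
def process_batch(records, processed):
    """Process only records that have not been seen before.
    'processed' is a set of fingerprints built from the primary
    identifier and the record content (illustrative sketch)."""
    newly_processed = []
    for record in records:
        fingerprint = (record["id"], tuple(sorted(record.items())))
        if fingerprint not in processed:
            processed.add(fingerprint)
            newly_processed.append(record)
    return newly_processed

processed = set()
batch = [{"id": "A1", "name": "Ann"}]
print(len(process_batch(batch, processed)))  # 1: first run processes the record
print(len(process_batch(batch, processed)))  # 0: identical record is skipped
# A changed primary identifier makes the record count as new again.
print(len(process_batch([{"id": "A2", "name": "Ann"}], processed)))  # 1
```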
diff --git a/docs/040-integration/manual-data-entry/010-configure-a-manual-data-entry-project.md b/docs/040-integration/manual-data-entry/010-configure-a-manual-data-entry-project.md new file mode 100644 index 00000000..029db0c0 --- /dev/null +++ b/docs/040-integration/manual-data-entry/010-configure-a-manual-data-entry-project.md @@ -0,0 +1,159 @@ +--- +layout: cluedin +nav_order: 010 +parent: Manual data entry +grand_parent: Integration +permalink: /integration/manual-data-entry/configure-a-manual-data-entry-project +title: Configure a manual data entry project +last_modified: 2025-04-01 +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will learn how to create and configure a manual data entry project to be able to add the records manually directly in CluedIn. + +A manual data entry project is an underlying component of manual data entry in CluedIn. It contains the configuration, mapping, and permissions for records created within the project. The process of creating a manual data entry project consists of 4 parts: + +1. Creating a manual data entry project with basic configuration – defining the business domain and vocabulary for the golden records that will be produced, as well as setting up the option to send records for approval before processing. + +1. Adding the form fields in a manual data entry project – defining the specific types of data that will be added manually in order to create a record. A field represents a property of a record. + +1. Reviewing and modifying the mapping configuration – ensuring that the primary identifier as well as additional identifiers for records in manual data entry project are configured correctly. + +1. 
Defining the quality of the manual data entry project source – defining quality is useful if you have [survivorship rules](/management/rules/rule-types#survivorship-rules) that determine which value from multiple sources should be used in a golden record based on the quality of the source. + +## Create a manual data entry project + +You can create as many manual data entry projects as you need. Consider having separate projects for different types of business data. For example, you can create a project for contact data, a project for product data, and a project for customer data. This way, you can better organize your manual records and ensure that each type of information is handled appropriately. + +**To create a manual data entry project** + +1. On the navigation pane, go to **Ingestion** > **Manual Data Entry**. + +1. Select **Create**. + +1. Enter the name of the manual data entry project. + +1. Select the business domain for records that will be created in the manual data entry project. + +1. Select the vocabulary that will be used in the manual data entry project. The vocabulary keys from this vocabulary will be available for selection when you [create](#create-form-fields) the form fields. + +1. If you want the records created in the manual data entry project by non-owner users to be sent for approval, select the **Require records approval** checkbox. + + If you enable records approval, you, as the project owner, and any other owners will receive notifications when non-owner users try to add new records to the project. The owner needs to approve these records before they can be processed. For more information, see [Source record approval](/integration/additional-operations-on-records/source-records-approval). + +1. (Optional) Enter the description of the manual data entry project. + + ![manual-data-entry-create-project.png](../../assets/images/integration/manual-data-entry/manual-data-entry-create-project.png) + +1. Select **Create**. 
+ + The manual data entry project page opens where you can proceed to add the form fields. + +## Create form fields + +A form field is an element in a manual data entry project that represents a property of a record. For example, if you created a manual data entry project for contact data, your form fields may include ID, First Name, Last Name, Email, and Phone Number. Essentially, the process of adding a manual record consists of entering values into the defined form fields. + +**To create a form field** + +1. In the manual data entry project, go to the **Form Fields** tab. + +1. Select **Create Form Field**. + +1. In the **Vocabulary Key** section, expand the dropdown list, and select the vocabulary key that will be used as a field for manual data entry. + + By default, the list includes the vocabulary keys that belong to the vocabulary that you selected when creating the project. If you want to add a vocabulary key from another vocabulary, clear the checkmark and then find and select the needed vocabulary key. + +1. Review the **Label** of the form field. This is the name of the field that will be displayed when adding a manual record. + + The label is added automatically based on the vocabulary key selected in the previous step. You can modify the label if needed. If you modify the label, the changes will only be visible in the manual data entry project, not in the vocabulary key. + +1. Select the **Form Field Type**: + + - **Text Field** – an input type where you can enter any text in the field. + + - **Pick List** – an input type where you can select an option from a predefined list. If you select this form field type, you need to add the options for the list. + + - **Toggle** – an input type where you can switch between two states—true or false. + +1. If you selected **Pick List** in the previous step, add the pick list items: + + 1. In the **Pick list items** field, enter an option for the list, and then select **+ Add item**. 
+ + The pick list item appears in the **Added pick list items** section. + + 1. To add more pick list items, repeat step 6.1. + + ![manual-data-entry-pick-list.png](../../assets/images/integration/manual-data-entry/manual-data-entry-pick-list.png) + +1. If the field is required in a record, turn on the toggle for **Is Required**. + +1. If you want to restrict the input in the field to only existing values from the vocabulary key, turn on the toggle for **Only Existing Values**. This option is available only for the **Text Field** type. + +1. If you want to use the field for producing the identifier for a record, turn on the toggle for **Use as identifier**. + +1. (Optional) Enter the description of the form field. + + ![manual-data-entry-create-form-field.png](../../assets/images/integration/manual-data-entry/manual-data-entry-create-form-field.png) + +1. Select **Create**. + + The form field is added to the **Form Fields** tab of the manual data entry project. + +1. To add more form fields, repeat steps 2–11. + + Once you have created all the form fields you need, [review](#review-and-modify-mapping-configuration) the mapping configuration for the records that will be created in the project and modify it as needed. + +On the **Form Fields** tab, the fields are displayed in the order in which they appear when you add a record. You can change the order of the fields if needed. To do this, on the right side of the row, open the three-dot menu, and then select where you want to move the row. + +![manual-data-entry-move-field.png](../../assets/images/integration/manual-data-entry/manual-data-entry-move-field.png) + +## Review and modify mapping configuration + +The mapping configuration for records from the manual data entry project is created automatically based on the details you provide when adding form fields. 
By default, the [primary identifier](/key-terms-and-features/entity-codes) and the [origin](/key-terms-and-features/origin) are auto-generated to ensure uniqueness of the records. Additionally, if you turned on the **Use as identifier** toggle for a form field, this field will be used to produce additional identifiers for the records. + +Note that the default mapping configuration does not include the name for the records. This is the name that is displayed during search as well as on the golden record details page. If you do not select a field for producing the name, CluedIn will use the automatically generated record ID. + +**To review and modify mapping configuration** + +1. In the manual data entry project, go to the **Map** tab. You will see the default mapping configuration. + + ![manual-data-entry-mapping.png](../../assets/images/integration/manual-data-entry/manual-data-entry-mapping.png) + +1. Select **Edit mapping**. + +1. In the **General details** section, select the field that will be used to produce the name for a record once it has been processed. You can add multiple fields. + + ![manual-data-entry-mapping-general-details.png](../../assets/images/integration/manual-data-entry/manual-data-entry-mapping-general-details.png) + +1. In the **Primary identifier** section, review the default configuration for producing a primary identifier: origin and property (field). You can select another origin and field if needed. + + ![manual-data-entry-mapping-primary-identifier.png](../../assets/images/integration/manual-data-entry/manual-data-entry-mapping-primary-identifier.png) + +1. In the **Identifiers** section, review the default configuration for producing additional identifiers. You can edit or delete the default identifier as well as add new additional identifiers. + + ![manual-data-entry-mapping-identifiers.png](../../assets/images/integration/manual-data-entry/manual-data-entry-mapping-identifiers.png) + +1. Select **Finish**. 
+ + Once you have reviewed and modified the mapping configuration as needed, you can proceed to [define](#define-quality-of-manual-data-entry-project-source) the quality of the manual data entry project source. This is only necessary if you use survivorship rules that determine the winning value based on the quality of the source. If you do not use such survivorship rules, you can proceed to [add](/integration/manual-data-entry/add-records-in-a-manual-data-entry-project) the records manually. + +## Define quality of manual data entry project source + +If you have [survivorship rules](/management/rules/rule-types#survivorship-rules) that determine which value from multiple sources should be used in a golden record based on the quality of the source, then you need to define the quality of the manual data entry project. If you believe that the values from the manual data entry project are of higher quality and more trustworthy than those from other sources, you can assign a higher quality rating to the manual data entry project. This way, in case of conflicting values between the manual data entry project and another source, CluedIn will prioritize the value from the manual data entry project. + +**To define quality of manual data entry project source** + +1. In the manual data entry project, go to the **Quality** tab. + +1. In the **Source** section, select the category that best describes the manual data entry project. + +1. In the **Source Quality** section, define the quality rating for the source by dragging the slider towards **Lower Quality** or **Higher Quality**. + + ![manual-data-entry-quality.png](../../assets/images/integration/manual-data-entry/manual-data-entry-quality.png) + +1. Select **Save**. + + The quality of the manual data entry project is updated. Next, you can proceed to [add](/integration/manual-data-entry/add-records-in-a-manual-data-entry-project) the records manually. 
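The quality-based survivorship behavior described above can be illustrated with a short sketch. This is not CluedIn code; the function name, record shape, and quality values are purely hypothetical.

```python
# Hypothetical illustration of quality-based survivorship (not CluedIn
# internals): when several sources supply conflicting values for the same
# property, the value from the source with the highest quality rating wins.

def pick_winning_value(candidates):
    """candidates: list of (source_name, quality_rating, value) tuples."""
    # The highest-quality source supplies the winning value.
    source, quality, value = max(candidates, key=lambda c: c[1])
    return value

candidates = [
    ("CRM import", 0.5, "info@example.com"),
    ("Manual data entry", 0.9, "contact@example.com"),  # rated higher
]
print(pick_winning_value(candidates))  # contact@example.com
```

In this sketch, raising the quality rating of the manual data entry source is what makes its value survive a conflict, which mirrors the effect of moving the slider towards **Higher Quality**.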
diff --git a/docs/040-integration/manual-data-entry/020-add-records-in-a-manual-data-entry-project.md b/docs/040-integration/manual-data-entry/020-add-records-in-a-manual-data-entry-project.md new file mode 100644 index 00000000..22d7186c --- /dev/null +++ b/docs/040-integration/manual-data-entry/020-add-records-in-a-manual-data-entry-project.md @@ -0,0 +1,100 @@ +--- +layout: cluedin +nav_order: 020 +parent: Manual data entry +grand_parent: Integration +permalink: /integration/manual-data-entry/add-records-in-a-manual-data-entry-project +title: Add records in a manual data entry project +last_modified: 2025-04-01 +--- +## On this page
{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will learn how to add records manually, directly in CluedIn. + +Once you have created a manual data entry project and added all the form fields you need, you can start adding records manually. There are two options for adding records manually: + +- Adding single records – use this option if you want to add records one by one. Right after you fill out all the required fields for a record, you can generate the record and publish it to CluedIn. + +- Adding multiple records – use this option if you want to add the records in one session in a tabular view. You can add as many records as you need in one session, and then generate all records and publish them to CluedIn simultaneously. + +## Add single record + +The process of adding a single record consists of filling out all the required fields and then generating the record. You can add records one at a time. There are two ways to add a record: + +- [In the manual data entry project](#add-a-record-in-the-manual-data-entry-project) +- [From the action center](#add-a-record-from-the-action-center) + +### Add a record in the manual data entry project + +1. On the navigation pane, go to **Ingestion** > **Manual Data Entry**. + +1. Find and open the project where you want to add a record. + +1. 
In the upper-right corner of the project page, select **Add** > **Add single record**. + +1. Fill out the fields for a record. + + ![manual-data-entry-add-single-record.png](../../assets/images/integration/manual-data-entry/manual-data-entry-add-single-record.png) + +1. In the upper-right corner of the page, select **Generate**. + + After the record is processed, you can find it on the **Data** tab of the manual data entry project. To add more records, repeat steps 3–5. + +### Add a record from the action center + +1. On the navigation pane, select **Create**. + +1. Select **Enter data manually**. + + The **Manual data entry** pane opens on the right side of the page. + +1. In the **Name** dropdown list, find and select the manual data entry project where you want to add a record. + + After selecting a project, the relevant form fields will appear. + +1. Fill out the fields for a record. + + ![manual-data-entry-add-in-action-center.png](../../assets/images/integration/manual-data-entry/manual-data-entry-add-in-action-center.png) + +1. Select **Add record**. + + After the record is processed, you can find it on the **Data** tab of the manual data entry project you selected in step 3. To add more records, repeat steps 2–5. + +## Add multiple records + +The process of adding multiple records requires creating a session, which represents a workspace with a tabular view. There, you can fill out the fields for all records and then generate the records simultaneously. + +**Prerequisites** + +Go to **Administration** > **Feature Flags**, and then enable the **Manual data entry tabular view** feature. + +![manual-data-entry-tabular-view-feature-flag.png](../../assets/images/integration/manual-data-entry/manual-data-entry-tabular-view-feature-flag.png) + +**To add multiple records** + +1. In the upper-right corner of the manual data entry project page, select **Add** > **Add multiple records**. + +1. 
Enter the name of the session, and then select **Create**. + + A manual data entry session opens, where you can enter the data manually in a tabular format. One row represents one record, and the columns represent the form fields. To add more rows, select **Add row**. + +1. Enter the data in the fields. + + ![manual-data-entry-multiple-records.png](../../assets/images/integration/manual-data-entry/manual-data-entry-multiple-records.png) + + If you decide that you no longer need a record, you can delete it by selecting the delete icon in the **Actions** column. + + If you want to take a break from entering records, you can close the session and return to it later. Any records you have entered will be saved. + +1. After you have entered all records, send them for processing: + + 1. Select **Generate**. + + 1. If you want to remove the session once the records have been processed, select the **Cleanup session** checkbox. If you do not select this checkbox, the session will remain on the **Sessions** tab. + + 1. Select **Confirm**. + + Once processed, the records appear on the **Data** tab of the manual data entry project. \ No newline at end of file diff --git a/docs/040-integration/manual-data-entry/030-manage-a-manual-data-entry-project.md b/docs/040-integration/manual-data-entry/030-manage-a-manual-data-entry-project.md new file mode 100644 index 00000000..5367d99b --- /dev/null +++ b/docs/040-integration/manual-data-entry/030-manage-a-manual-data-entry-project.md @@ -0,0 +1,95 @@ +--- +layout: cluedin +nav_order: 030 +parent: Manual data entry +grand_parent: Integration +permalink: /integration/manual-data-entry/manage-a-manual-data-entry-project +title: Manage a manual data entry project +last_modified: 2025-04-01 +--- +## On this page
{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will learn how to manage a manual data entry project: edit the project configuration and form fields, and manage access to the project and to data from the project. 
+ +## Edit manual data entry project configuration + +Once you have created a manual data entry project, you can change its configuration if needed. You can edit the following configuration details of a manual data entry project: project name, vocabulary, record approval option, and description. You cannot change the business domain. + +**To edit manual data entry project configuration** + +1. On the navigation pane, go to **Ingestion** > **Manual Data Entry**. + +1. Find and open the project that you want to edit. + +1. In the upper-right corner, select **Edit**. + +1. Make the needed changes. + +1. Select **Save**. + + The manual data entry project configuration is updated. + +## Edit form fields + +Once you have created the form fields, you can modify them if needed. You can edit all the details of a form field. Additionally, if you no longer want to use a form field to create manual records, you can archive it. + +Note that the changes you make in the form fields do not affect the records that have already been generated and processed. + +**To edit a form field** + +1. In the manual data entry project, go to the **Form Fields** tab. + +1. Select the form field that you want to edit. + +1. Make the needed changes. + +1. Select **Save**. + + The form field configuration is updated. + +**To archive a form field** + +1. In the manual data entry project, go to the **Form Fields** tab. + +1. Select the form field that you want to archive. + +1. Select **Archive**, and then select **Confirm**. + + The status of the form field becomes **Archived**. This form field is no longer available when adding new manual records. + +## Manage sessions + +If you used the option to add multiple records simultaneously, then your manual data entry project contains sessions that store such records. These sessions are available on the **Sessions** tab. 
The sessions can have one of the following statuses: + +- **Not generated** – this status means that either the session does not contain any records, or it contains records that have not yet been generated and processed. You can open such a session and add records. When you select the session, it opens in a new tab. + +- **Generated** – this status means that the records from the session have been generated and processed. You can open such a session and add more records if needed. When you generate the records, previously generated records will remain unchanged, while newly added records will be generated and processed. + +If you no longer need a session, you can delete it. This action is irreversible, and you will not be able to recover the deleted session. The records already generated from the session will remain intact. + +**To delete a session** + +1. In the manual data entry project, go to the **Sessions** tab. + +1. Find the session that you want to delete. + +1. On the right side of the session row, open the three-dot menu, and then select **Delete**. + + ![manual-data-entry-delete-session.png](../../assets/images/integration/manual-data-entry/manual-data-entry-delete-session.png) + +1. Confirm that you want to delete the session. + + The session is no longer listed on the **Sessions** tab. + +## Manage access to manual data entry project + +The user who created the manual data entry project is the owner of the project. This user can make direct changes to the project, approve or reject changes submitted by non-owner users, process the records that require approval, as well as add other users or roles to the list of owners. You can find the list of users and/or roles who can manage the manual data entry project on the **Owners** tab of the manual data entry project. For more information about ownership, see [Feature access](/administration/user-access/feature-access). 
+ +## Manage access to data from manual data entry project + +The user who created the manual data entry project has access to all records generated within the project. This user can grant permission to other users and/or roles to access the records generated within the project. On the **Permissions** tab of the manual data entry project, you can find the list of users and/or roles who have access to all records generated within the project. For more information about permissions, see [Data access](/administration/user-access/data-access). + +If the user is not listed on the **Permissions** tab of the project, they will not be able to view the generated records on the **Data** tab. \ No newline at end of file diff --git a/docs/050-preparation/clean/030-clean-reference.md b/docs/050-preparation/clean/030-clean-reference.md index 053c0aa2..07191187 100644 --- a/docs/050-preparation/clean/030-clean-reference.md +++ b/docs/050-preparation/clean/030-clean-reference.md @@ -38,4 +38,21 @@ The following diagram shows the clean project workflow along with its statuses a ![clean-reference-1.png](../../assets/images/preparation/clean/clean-reference-1.png) -The **Archived** status is not shown in the diagram, but you can archive the clean project when it is in any status except **Generation aborting**, **Processing aborting**, and **Revert aborting**. \ No newline at end of file +The **Archived** status is not shown in the diagram, but you can archive the clean project when it is in any status except **Generation aborting**, **Processing aborting**, and **Revert aborting**. + +## Clean project audit log actions + +Whenever some changes or actions are made in the clean project, they are recorded and can be found on the **Audit Log** tab. 
These actions include the following: + +- Create a clean project +- Add users to owners +- Update a clean project +- Generate results +- Generate rules +- Commit a project +- Regenerate results +- Revert (undo) changes +- Cancel committing a project +- Cancel generation of results +- Cancel reverting (undoing) changes +- Archive a project \ No newline at end of file diff --git a/docs/050-preparation/enricher/010-concept-of-enricher.md b/docs/050-preparation/enricher/010-concept-of-enricher.md index 74a43879..1c5d62ab 100644 --- a/docs/050-preparation/enricher/010-concept-of-enricher.md +++ b/docs/050-preparation/enricher/010-concept-of-enricher.md @@ -39,7 +39,7 @@ When the enricher receives the vocabulary key value, it calls an external intern **From clue to golden record** -When a new clue appears in CluedIn from the enricher, it goes into the processing pipeline. It is important to note that such clue has the same entity origin code as the golden record. During processing, CluedIn transforms the clue into a data part and executes merging by codes to ensure that the new information seamlessly integrates with the existing golden record. To learn more about what happens to the clue during processing, see [Data life cycle](/key-terms-and-features/data-life-cycle). +When a new clue appears in CluedIn from the enricher, it goes into the processing pipeline. It is important to note that such a clue has the same primary identifier as the golden record. During processing, CluedIn transforms the clue into a data part and executes merging by identifiers to ensure that the new information seamlessly integrates with the existing golden record. To learn more about what happens to the clue during processing, see [Data life cycle](/key-terms-and-features/data-life-cycle). {:.important} The processing of a clue from an enricher follows the same steps as any other clue within the system. 
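The merge-by-identifier behavior described above can be sketched as follows. This is an illustration only, with made-up identifiers and record shapes, not CluedIn's actual implementation.

```python
# Illustrative sketch (not CluedIn internals): a clue whose primary
# identifier matches an existing golden record is merged into that record
# as an additional data part instead of creating a new record.
golden_records = {}

def process_clue(identifier, data_part):
    record = golden_records.setdefault(identifier, {"parts": []})
    record["parts"].append(data_part)  # merge by identifier

# A record from a source system, then a clue from an enricher carrying the
# same (hypothetical) primary identifier:
process_clue("/Organization#CRM:42", {"source": "CRM", "name": "Acme A/S"})
process_clue("/Organization#CRM:42", {"source": "enricher", "website": "acme.example"})

print(len(golden_records))  # 1 – the enricher clue did not create a duplicate
print(len(golden_records["/Organization#CRM:42"]["parts"]))  # 2
```

Because the enricher clue shares the golden record's primary identifier, it lands as a second data part on the same record, which is why enrichment integrates seamlessly rather than producing duplicates.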
\ No newline at end of file diff --git a/docs/050-preparation/enricher/020-add-enricher.md b/docs/050-preparation/enricher/020-add-enricher.md index 088813ad..e5f6e2ee 100644 --- a/docs/050-preparation/enricher/020-add-enricher.md +++ b/docs/050-preparation/enricher/020-add-enricher.md @@ -57,7 +57,7 @@ If you process the data and then add an enricher, the enrichment won't start aut 1. On the navigation pane, go to **Consume** > **GraphQL**. -1. Enter a query to enrich all golden records that belong to a certain entity type. Replace _/Organization_ with the needed name of entity type. +1. Enter a query to enrich all golden records that belong to a certain business domain. Replace _/Organization_ with the needed name of business domain. ``` { @@ -73,7 +73,7 @@ If you process the data and then add an enricher, the enrichment won't start aut 1. Execute the query. - You triggered the enrichment for the golden records belonging to the specified entity type. Now, you can view the enrichment results on the golden record details page. + You triggered the enrichment for the golden records belonging to the specified business domain. Now, you can view the enrichment results on the golden record details page. **To trigger enrichment for each golden record manually** diff --git a/docs/050-preparation/enricher/030-enricher-reference.md b/docs/050-preparation/enricher/030-enricher-reference.md index 8f5651d5..7f61ad17 100644 --- a/docs/050-preparation/enricher/030-enricher-reference.md +++ b/docs/050-preparation/enricher/030-enricher-reference.md @@ -16,6 +16,19 @@ In this article, you will find reference information about built-in enrichers in {:.important} Please note that the enrichers are not included in the CluedIn license. Each enricher is an open-source package provided by the CluedIn team for free to help you enrich your golden records with information from external sources. 
+## Azure OpenAI + +The [Azure OpenAI](/preparation/enricher/azure-openai) enricher allows you to enhance data quality by providing more complete, current, and detailed information for your golden records. It supports the following endpoints: + +- `{baseUrl}/openai/deployments/{deploymentName}/completions?api-version=2022-12-01` + +- `{baseUrl}/openai/deployments/{deploymentName}/chat/completions?api-version=2024-06-01` + +| Package name | Package version | Source code | +|--|--|--| +| CluedIn.Enricher.AzureOpenAI | 4.4.0 | [Source code](https://github.com/CluedIn-io/CluedIn.Enricher.AzureOpenAI/releases/tag/4.4.0) | + + ## Brreg The [Brreg](/preparation/enricher/brreg) enricher retrieves a wide range of information about Norwegian and foreign businesses operating in Norway. It supports the following endpoints: diff --git a/docs/050-preparation/enricher/040-brreg.md b/docs/050-preparation/enricher/040-brreg.md index 2b2c20c5..24b058bb 100644 --- a/docs/050-preparation/enricher/040-brreg.md +++ b/docs/050-preparation/enricher/040-brreg.md @@ -45,21 +45,21 @@ The enricher requires at least one of the following attributes for searching the 1. On the **Configure** tab, provide the following details: - 1. **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. + 1. **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. 1. **Name Vocabulary Key** – enter the vocabulary key that contains the names of companies that will be used for searching the Brreg register. 1. **Country Code Vocabulary Key** – enter the vocabulary key that contains the country codes of companies that will be used for searching the Brreg register. + ![brreg-enricher-config-1.png](../../assets/images/preparation/enricher/brreg-enricher-config-1.png) + 1. **Website Vocabulary Key** – enter the vocabulary key that contains the websites of companies that will be used for searching the Brreg register. 1. 
**Brreg Code Vocabulary Key** – enter the vocabulary key that contains the Brreg codes of companies that will be used for searching the Brreg register. - 1. **Skip entity Code Creation (Brreg Code)** – turn on the toggle if you don't want to add new entity codes that come from the source system to the enriched golden records. Otherwise, new entity codes containing Brregs codes will be added to the enriched golden records. - - ![brreg-enricher-4.png](../../assets/images/preparation/enricher/brreg-enricher-4.png) + ![brreg-enricher-config-2.png](../../assets/images/preparation/enricher/brreg-enricher-config-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Brreg enricher is added and has an active status. This means that it will enrich relevant golden records during processing or when you trigger external enrichment. @@ -67,8 +67,6 @@ After the Brreg enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![brreg-enricher-2.png](../../assets/images/preparation/enricher/brreg-enricher-2.png) - - **Authentication** – modify the details you provided while configuring the enricher. ## Properties from Brreg enricher diff --git a/docs/050-preparation/enricher/050-clearbit.md b/docs/050-preparation/enricher/050-clearbit.md index 0819281d..7a777d61 100644 --- a/docs/050-preparation/enricher/050-clearbit.md +++ b/docs/050-preparation/enricher/050-clearbit.md @@ -37,17 +37,17 @@ The enricher requires at least one of the following attributes to search for com 1. On the **Configure** tab, provide the following details: - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. 
- - **Website vocab key** – enter the vocabulary key that contains company websites that will be used to search for company domain and logo. + - **Website Vocabulary Key** – enter the vocabulary key that contains company websites that will be used to search for company domain and logo. - - **Organization Name vocab key** – enter the vocabulary key that contains company names that will be used to search for company domain and logo. + - **Organization Name Vocabulary Key** – enter the vocabulary key that contains company names that will be used to search for company domain and logo. - - **Email Domain vocab key** – enter the vocabulary key that contains company email domains that will be used to search for company domain and logo. + - **Email Domain Vocabulary Key** – enter the vocabulary key that contains company email domains that will be used to search for company domain and logo. ![clearbit-enricher-2.png](../../assets/images/preparation/enricher/clearbit-enricher-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Clearbit enricher is added and has an active status. This means that it will enrich relevant golden records when they are processed or when you trigger external enrichment. @@ -55,8 +55,6 @@ After the Clearbit enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![clearbit-enricher-3.png](../../assets/images/preparation/enricher/clearbit-enricher-3.png) - - **Authentication** – modify the details you provided while configuring the enricher. 
## Properties from Clearbit enricher diff --git a/docs/050-preparation/enricher/060-companies-house.md b/docs/050-preparation/enricher/060-companies-house.md index 76341cd6..432047b7 100644 --- a/docs/050-preparation/enricher/060-companies-house.md +++ b/docs/050-preparation/enricher/060-companies-house.md @@ -43,21 +43,19 @@ You can add input parameters for the enricher (organization name, country, and C - **API Key** – enter the API key for retrieving information from the Companies House website. - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. - - **Companies House Number Vocab Key** – enter the vocabulary key that contains the Companies House number that will be used for searching the Companies House website. + - **Companies House Number Vocabulary Key** – enter the vocabulary key that contains the Companies House number that will be used for searching the Companies House website. - - **Country Vocab Key** – enter the vocabulary key that contains the countries of companies that will be used for searching the Companies House website. + ![comapnies-house-enricher-config-1.png](../../assets/images/preparation/enricher/comapnies-house-enricher-config-1.png) - - **Organization Name Vocab Key** – enter the vocabulary key that contains the names of companies that will be used for searching the Companies House website. + - **Country Vocabulary Key** – enter the vocabulary key that contains the countries of companies that will be used for searching the Companies House website. - - **Skip Entity Code Creation (Company House Number)** – turn on the toggle if you don't want to add new entity codes that come from the source system to the enriched golden records. Otherwise, new entity codes containing Companies House number will be added to the enriched golden records. 
+ - **Organization Name Vocabulary Key** – enter the vocabulary key that contains the names of companies that will be used for searching the Companies House website. - - **Skip Entity Code Creation (Company Name)** – turn on the toggle if you don't want to add new entity codes that come from the source system to the enriched golden records. Otherwise, new entity codes containing company names will be added to the enriched golden records. + ![comapnies-house-enricher-config-2.png](../../assets/images/preparation/enricher/comapnies-house-enricher-config-2.png) - ![comapnies-house-enricher-2.png](../../assets/images/preparation/enricher/comapnies-house-enricher-2.png) - -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Companies House enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -65,8 +63,6 @@ After the Companies House enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![comapnies-house-enricher-3.png](../../assets/images/preparation/enricher/comapnies-house-enricher-3.png) - - **Authentication** – modify the details you provided while configuring the enricher. ## Properties from Companies House enricher diff --git a/docs/050-preparation/enricher/070-cvr.md b/docs/050-preparation/enricher/070-cvr.md index 32758085..3c8db291 100644 --- a/docs/050-preparation/enricher/070-cvr.md +++ b/docs/050-preparation/enricher/070-cvr.md @@ -51,23 +51,25 @@ The enricher requires at least one of the following attributes for searching the 1. On the **Configure** tab, provide the following details: - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. 
+ - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. - - **Organization Name Vocab Key** – enter the vocabulary key that contains company names that will be used for searching the CVR register. + - **Organization Name Vocabulary Key** – enter the vocabulary key that contains company names that will be used for searching the CVR register. - **Organization Name Normalization** – turn on the toggle if you want to normalize company names that will be used for searching the CVR register. The normalization removes trailing backslashes (\), slashes (/), and vertical bars; also, it changes the names to lowercase. The normalization does not affect company names in CluedIn. - **Match Past Organization Names** – turn on the toggle if you want to allow the enricher to accept data that matches the search text (organization name), even if the name of the latest data in the CVR register doesn't exactly match the search text. For example, Pfizer ApS is one of the old names of Pfizer A/S; by turning on the toggle, you can search for Pfizer ApS using the search text (Pfizer ApS). - - **CVR Vocab Key** – enter the vocabulary key that contains company CVR codes that will be used for searching the CVR register. + ![cvr-enricher-config-1.png](../../assets/images/preparation/enricher/cvr-enricher-config-1.png) - - **Country Vocab Key** – enter the vocabulary key that contains company countries that will be used for searching the CVR register. + - **CVR Vocabulary Key** – enter the vocabulary key that contains company CVR codes that will be used for searching the CVR register. - - **Website Vocab Key** – enter the vocabulary key that contains company websites that will be used for searching the CVR register. + - **Country Vocabulary Key** – enter the vocabulary key that contains company countries that will be used for searching the CVR register. 
- ![cvr-enricher-2.png](../../assets/images/preparation/enricher/cvr-enricher-2.png) + - **Website Vocabulary Key** – enter the vocabulary key that contains company websites that will be used for searching the CVR register. -1. Select **Add**. + ![cvr-enricher-config-2.png](../../assets/images/preparation/enricher/cvr-enricher-config-2.png) + +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The CVR enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -75,9 +77,7 @@ After the CVR enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![cvr-enricher-3.png](../../assets/images/preparation/enricher/cvr-enricher-3.png) - -- **Authentication** – modify the details you provided while configuring the enricher: **Accepted Entity Type**, **Organization Name Vocab Key**, **Organization Name Normalization**, **CVR Vocab Key**, **Country Vocab Key**, **Website Vocab Key**. +- **Authentication** – modify the details you provided while configuring the enricher. ## Properties from CVR enricher diff --git a/docs/050-preparation/enricher/080-duckduckgo.md b/docs/050-preparation/enricher/080-duckduckgo.md index 68e2c5f3..02cec90b 100644 --- a/docs/050-preparation/enricher/080-duckduckgo.md +++ b/docs/050-preparation/enricher/080-duckduckgo.md @@ -31,15 +31,15 @@ The enricher uses the organization name and/or the website to search for informa 1. On the **Configure** tab, provide the following details: - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. 
- - **Organization Name Vocab Key** – enter the vocabulary key that contains the names of companies that will be used for searching the DuckDuckGo engine. + - **Organization Name Vocabulary Key** – enter the vocabulary key that contains the names of companies that will be used for searching the DuckDuckGo engine. - - **Website Vocab Key** – enter the vocabulary key that contains the websites of companies that will be used for searching the DuckDuckGo engine. + - **Website Vocabulary Key** – enter the vocabulary key that contains the websites of companies that will be used for searching the DuckDuckGo engine. ![duck-duck-go-enricher-2.png](../../assets/images/preparation/enricher/duck-duck-go-enricher-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The DuckDuckGo enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -47,9 +47,7 @@ After the DuckDuckGo enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![duck-duck-go-enricher-3.png](../../assets/images/preparation/enricher/duck-duck-go-enricher-3.png) - -- **Authentication** – modify the details you provided while configuring the enricher: **Accepted Entity Type**, **Organization Name Vocab Key**, **Organization Website Vocab Key**. +- **Authentication** – modify the details you provided while configuring the enricher. 
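The DuckDuckGo enricher builds its search from the values found under the organization name and website vocabulary keys configured above. As a rough sketch of what such a lookup involves — assuming DuckDuckGo's public Instant Answer API, which may differ from the endpoint and parameters the enricher actually uses — a query URL can be composed like this:

```python
from urllib.parse import urlencode

# Assumption: DuckDuckGo's public Instant Answer API; the enricher's
# actual endpoint and parameters may differ.
DDG_ENDPOINT = "https://api.duckduckgo.com/"

def instant_answer_url(company_name: str) -> str:
    """Compose a JSON Instant Answer query from a company name,
    i.e. the value found under the Organization Name Vocabulary Key."""
    params = {
        "q": company_name,   # search text taken from the vocabulary key
        "format": "json",    # ask for a machine-readable response
        "no_html": "1",      # strip HTML markup from the answer text
    }
    return DDG_ENDPOINT + "?" + urlencode(params)
```

For example, `instant_answer_url("Pfizer A/S")` yields `https://api.duckduckgo.com/?q=Pfizer+A%2FS&format=json&no_html=1`; note that `urlencode` takes care of escaping characters such as the slash in the company name.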
## Properties from DuckDuckGo enricher diff --git a/docs/050-preparation/enricher/090-gleif.md b/docs/050-preparation/enricher/090-gleif.md index bc1790f9..47bfc2f5 100644 --- a/docs/050-preparation/enricher/090-gleif.md +++ b/docs/050-preparation/enricher/090-gleif.md @@ -31,7 +31,7 @@ The enricher uses the Legal Entity Identifier (LEI) code to search for informati 1. On the **Configure** tab, provide the following details: - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. - **Lei Code Vocabulary Key** – enter the vocabulary key that contains LEI codes of companies that you want to enrich. @@ -39,7 +39,7 @@ The enricher uses the Legal Entity Identifier (LEI) code to search for informati ![gleif-enricher-5.png](../../assets/images/preparation/enricher/gleif-enricher-5.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Gleif enricher is added and has an active status. This means that it will enrich relevant golden records during processing or when you trigger external enrichment. @@ -47,8 +47,6 @@ After the Gleif enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![gleif-enricher-2.png](../../assets/images/preparation/enricher/gleif-enricher-2.png) - - **Authentication** – modify the details you provided while configuring the enricher. 
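The Gleif enricher keys its lookups on the LEI code found under the configured vocabulary key. As an illustrative sketch — assuming the public GLEIF REST API, which may not be the exact endpoint the enricher calls — a single-record lookup URL can be validated and built like this (the LEI in the example below is made up):

```python
import re
from urllib.parse import quote

# Assumption: the public GLEIF REST API; the enricher's actual endpoint may differ.
GLEIF_API = "https://api.gleif.org/api/v1/lei-records"

# ISO 17442 shape: 18 alphanumeric characters followed by 2 check digits.
# (The mod-97 check-digit verification is omitted for brevity.)
LEI_PATTERN = re.compile(r"^[A-Z0-9]{18}[0-9]{2}$")

def lei_record_url(lei: str) -> str:
    """Build the lookup URL for one LEI record, rejecting malformed codes early."""
    lei = lei.strip().upper()
    if not LEI_PATTERN.fullmatch(lei):
        raise ValueError(f"not a well-formed LEI: {lei!r}")
    return f"{GLEIF_API}/{quote(lei)}"
```

A hypothetical code such as `5299000ABCDEFGHJK123` passes the shape check, while shorter or punctuated strings are rejected before any request is made — a cheap way to avoid wasted calls when the vocabulary key contains dirty values.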
## Properties from Gleif enricher diff --git a/docs/050-preparation/enricher/110-google-maps.md b/docs/050-preparation/enricher/110-google-maps.md index d0a684e2..ee98e138 100644 --- a/docs/050-preparation/enricher/110-google-maps.md +++ b/docs/050-preparation/enricher/110-google-maps.md @@ -57,7 +57,7 @@ The Google Maps enricher can use a variety of attributes for searching the Googl - For User: **User Address** -When you're configuring the Google Maps enricher for a specific entity type, make sure you fill in the relevant fields for that entity type. +When you're configuring the Google Maps enricher for a specific business domain, make sure you fill in the relevant fields for that business domain. **To add the Google Maps enricher** @@ -71,37 +71,39 @@ When you're configuring the Google Maps enricher for a specific entity type, mak - **API Key** – enter the API key for retrieving information from the Google Maps Platform. - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. - **Vocabulary Key used to control whether it should be enriched** – enter the vocabulary key that indicates if the golden record should be enriched. If the value is true, then the golden record will be enriched. Otherwise, the golden record will not be enriched. - - **Organization Name Vocab Key** – enter the vocabulary key that contains company names that will be used for searching the Google Maps Platform. + ![google-maps-enricher-config-1.png](../../assets/images/preparation/enricher/google-maps-enricher-config-1.png) - - **Organization Address Vocab Key** – enter the vocabulary key that contains company addresses that will be used for searching the Google Maps Platform. + - **Organization Name Vocabulary Key** – enter the vocabulary key that contains company names that will be used for searching the Google Maps Platform. 
- - **Organization City Vocab Key** – enter the vocabulary key that contains cities that will be used for searching the Google Maps Platform. + - **Organization Address Vocabulary Key** – enter the vocabulary key that contains company addresses that will be used for searching the Google Maps Platform. - - **Organization Zip** – enter the vocabulary key that contains company ZIP Codes that will be used for searching the Google Maps Platform. + ![google-maps-enricher-config-2.png](../../assets/images/preparation/enricher/google-maps-enricher-config-2.png) - - **Organization State Vocab Key** – enter the vocabulary key that contains states that will be used for searching the Google Maps Platform. + - **Organization City Vocabulary Key** – enter the vocabulary key that contains cities that will be used for searching the Google Maps Platform. - - **Organization Country Vocab Key** – enter the vocabulary key that contains countries that will be used for searching the Google Maps Platform. + - **Organization Zip Vocabulary Key** – enter the vocabulary key that contains company ZIP Codes that will be used for searching the Google Maps Platform. - - **Location Address Vocab Key** – enter the vocabulary key that contains location addresses that will be used for searching the Google Maps Platform. + - **Organization State Vocabulary Key** – enter the vocabulary key that contains states that will be used for searching the Google Maps Platform. - - **User Address Vocab Key** – enter the vocabulary key that contains user addresses that will be used for searching the Google Maps Platform. + - **Organization Country Vocabulary Key** – enter the vocabulary key that contains countries that will be used for searching the Google Maps Platform. - - **Person Address Vocab Key** – enter the vocabulary key that contains person addresses that will be used for searching the Google Maps Platform. 
+ - **Location Address Vocabulary Key** – enter the vocabulary key that contains location addresses that will be used for searching the Google Maps Platform. - - **Person Address City Vocab Key** – enter the vocabulary key that contains person cities that will be used for searching the Google Maps Platform. + - **User Address Vocabulary Key** – enter the vocabulary key that contains user addresses that will be used for searching the Google Maps Platform. - - **Latitude Vocab Key** – this field is not currently used for searching the Google Maps Platform. + - **Person Address Vocabulary Key** – enter the vocabulary key that contains person addresses that will be used for searching the Google Maps Platform. - - **Longitude Vocab Key** – this field is not currently used for searching the Google Maps Platform. + - **Person Address City Vocabulary Key** – enter the vocabulary key that contains person cities that will be used for searching the Google Maps Platform. - ![google-maps-enricher-2.png](../../assets/images/preparation/enricher/google-maps-enricher-2.png) + - **Latitude Vocabulary Key** – this field is not currently used for searching the Google Maps Platform. -1. Select **Add**. + - **Longitude Vocabulary Key** – this field is not currently used for searching the Google Maps Platform. + +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Google Maps enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -109,8 +111,6 @@ After the Google Maps enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. 
- ![google-maps-enricher-3.png](../../assets/images/preparation/enricher/google-maps-enricher-3.png) - - **Authentication** – modify the details you provided while configuring the enricher. ## Properties from Google Maps enricher diff --git a/docs/050-preparation/enricher/120-knowledge-graph.md b/docs/050-preparation/enricher/120-knowledge-graph.md index ed6d16a9..fbb4acaa 100644 --- a/docs/050-preparation/enricher/120-knowledge-graph.md +++ b/docs/050-preparation/enricher/120-knowledge-graph.md @@ -31,17 +31,17 @@ To use the Knowledge Graph enricher, you must provide the API key. To get the AP 1. On the **Configure** tab, provide the following details: - - **Key** – enter the API key for accessing Google’s Knowledge Graph database. + - **API Key** – enter the API key for accessing Google’s Knowledge Graph database. - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched using the Knowledge Graph enricher. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched using the Knowledge Graph enricher. - - **Organization Name Vocab Key** – enter the vocabulary key that contains the names of organizations that will be used for searching the Knowledge Graph database. + - **Organization Name Vocabulary Key** – enter the vocabulary key that contains the names of organizations that will be used for searching the Knowledge Graph database. - - **Website Vocab Key** – enter the vocabulary key that contains the websites of organizations that will be used for searching the Knowledge Graph database. + - **Website Vocabulary Key** – enter the vocabulary key that contains the websites of organizations that will be used for searching the Knowledge Graph database. ![knowledge-graph-enricher-2.png](../../assets/images/preparation/enricher/knowledge-graph-enricher-2.png) -1. Select **Add**. +1. 
Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Knowledge Graph enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -49,9 +49,7 @@ After the Knowledge Graph enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![knowledge-graph-enricher-3.png](../../assets/images/preparation/enricher/knowledge-graph-enricher-3.png) - -- **Authentication** – modify the details you provided to configure the enricher: **Key**, **Accepted Entity Type**, **Organization Name Vocab Key**, and **Website Vocab Key**. +- **Authentication** – modify the details you provided to configure the enricher. ## Properties from Knowledge Graph enricher diff --git a/docs/050-preparation/enricher/130-libpostal.md b/docs/050-preparation/enricher/130-libpostal.md index 8c970572..2b557dfc 100644 --- a/docs/050-preparation/enricher/130-libpostal.md +++ b/docs/050-preparation/enricher/130-libpostal.md @@ -19,7 +19,7 @@ The Libpostal enricher supports the following endpoint: ## Add Libpostal enricher -The Libpostal enricher uses the address as input to parse and normalize the street address used in a golden record. You can use this enricher to parse and normalize street addresses for organizations, users, persons, and locations. Depending on the entity type you specify in the enricher configuration, you will need to provide the appropriate vocabulary key that contains the address. If you don't provide the vocabulary key, CluedIn will use the following vocabulary keys by default: +The Libpostal enricher uses the address as input to parse and normalize the street address used in a golden record. 
You can use this enricher to parse and normalize street addresses for organizations, users, persons, and locations. Depending on the business domain you specify in the enricher configuration, you will need to provide the appropriate vocabulary key that contains the address. If you don't provide the vocabulary key, CluedIn will use the following vocabulary keys by default: - **Person Address Vocab Key** - person.home.address @@ -39,19 +39,19 @@ The Libpostal enricher uses the address as input to parse and normalize the stre 1. On the **Configure** tab, provide the following details: - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched using the Libpostal enricher. Depending on the entity type that you provide, you need to fill out one more field to define the vocabulary key that contains addresses of golden records that you want to enrich. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched using the Libpostal enricher. Depending on the business domain that you provide, you need to fill out one more field to define the vocabulary key that contains addresses of golden records that you want to enrich. - - **Person Address Vocab Key** – if you entered /Person as the accepted entity type, enter the vocabulary key that contains the home addresses of persons that you want to enrich. + - **Person Address Vocabulary Key** – if you entered /Person as the accepted business domain, enter the vocabulary key that contains the home addresses of persons that you want to enrich. - - **Organization Address Vocab Key** – if you entered /Organization as the accepted entity type, enter the vocabulary key that contains the addresses of organizations that you want to enrich. + - **Organization Address Vocabulary Key** – if you entered /Organization as the accepted business domain, enter the vocabulary key that contains the addresses of organizations that you want to enrich. 
- - **User Address Vocab Key** – if you entered /User as the accepted entity type, enter the vocabulary key that contains the addresses of users that you want to enrich. + - **User Address Vocabulary Key** – if you entered /User as the accepted business domain, enter the vocabulary key that contains the addresses of users that you want to enrich. - - **Location Address Vocab Key** – if you entered /Location as the accepted entity type, enter the vocabulary key that contains the addresses of locations that you want to enrich. + - **Location Address Vocabulary Key** – if you entered /Location as the accepted business domain, enter the vocabulary key that contains the addresses of locations that you want to enrich. ![libpostal-enricher-2.png](../../assets/images/preparation/enricher/libpostal-enricher-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Libpostal enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -59,8 +59,6 @@ After the Libpostal enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![libpostal-enricher-3.png](../../assets/images/preparation/enricher/libpostal-enricher-3.png) - - **Authentication** – modify the details you provided while configuring the enricher. ## Properties from Libpostal enricher diff --git a/docs/050-preparation/enricher/140-open-corporates.md b/docs/050-preparation/enricher/140-open-corporates.md index 660ecda0..0509ab41 100644 --- a/docs/050-preparation/enricher/140-open-corporates.md +++ b/docs/050-preparation/enricher/140-open-corporates.md @@ -33,17 +33,15 @@ To use the OpenCorporates enricher, you must provide the API token. To get the A 1. 
On the **Configure** tab, provide the following details: - - **API Token** – enter the API token for retrieving information from the OpenCorporates website. + - **API Key** – enter the API token for retrieving information from the OpenCorporates website. - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. - **Lookup Vocabulary Key** – enter the vocabulary key that contains the names of companies that you want to enrich. - - **Skip Entity Code Creation (Company Number)** – turn on the toggle if you don't want to add new entity codes that come from the source system to the enriched golden records. Otherwise, new entity codes containing company numbers will be added to the enriched golden records. - ![open-corporates-enricher-2.png](../../assets/images/preparation/enricher/open-corporates-enricher-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The OpenCorporates enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -51,8 +49,6 @@ After the OpenCorporates enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![open-corporates-enricher-3.png](../../assets/images/preparation/enricher/open-corporates-enricher-3.png) - - **Authentication** – modify the details you provided to configure the enricher. 
## Properties from OpenCorporates enricher diff --git a/docs/050-preparation/enricher/150-permid.md b/docs/050-preparation/enricher/150-permid.md index 467cee7e..11e95622 100644 --- a/docs/050-preparation/enricher/150-permid.md +++ b/docs/050-preparation/enricher/150-permid.md @@ -35,13 +35,13 @@ To use the PermID enricher, you must provide the API key. To get the API key, re - **API Key** – enter the API token for accessing the PermID database. - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched using the PermID enricher. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched using the PermID enricher. - - **Organization Name Vocab Key** – enter the vocabulary key that contains the names of organizations that will be used for retrieving information from the PermID database. + - **Organization Name Vocabulary Key** – enter the vocabulary key that contains the names of organizations that will be used for retrieving information from the PermID database. ![permid-enricher-2.png](../../assets/images/preparation/enricher/permid-enricher-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The PermID enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -49,9 +49,7 @@ After the PermID enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. - ![permid-enricher-3.png](../../assets/images/preparation/enricher/permid-enricher-3.png) - -- **Authentication** – modify the details you provided while configuring the enricher: **API Key**, **Accepted Entity Type**, **Organization Name Vocab Key**. 
+- **Authentication** – modify the details you provided while configuring the enricher. ## Properties from PermID enricher diff --git a/docs/050-preparation/enricher/160-vatlayer.md b/docs/050-preparation/enricher/160-vatlayer.md index bb017473..df4f5e91 100644 --- a/docs/050-preparation/enricher/160-vatlayer.md +++ b/docs/050-preparation/enricher/160-vatlayer.md @@ -31,15 +31,15 @@ To use the Vatlayer enricher, you will need to provide the API key. To get it, s 1. On the **Configure** tab, provide the following details: - - **API Key** – enter the API key for retrieving information from the Vatlayer website. + - **API Access Key** – enter the API key for retrieving information from the Vatlayer website. - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched using the Vatlayer enricher. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched using the Vatlayer enricher. - - **Accepted Vocab Key** – enter the vocabulary key that contains the VAT numbers of companies that you want to enrich. + - **Accepted Vocabulary Key** – enter the vocabulary key that contains the VAT numbers of companies that you want to enrich. ![vatlayer-enricher-2.png](../../assets/images/preparation/enricher/vatlayer-enricher-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Vatlayer enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -47,9 +47,7 @@ After the Vatlayer enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. 
- ![vatlayer-enricher-3.png](../../assets/images/preparation/enricher/vatlayer-enricher-3.png) - -- **Authentication** – modify the details you provided to configure the enricher: **API Key**, **Accepted Entity Type**, and **Accepted Vocab Key**. +- **Authentication** – modify the details you provided to configure the enricher. ## Properties from Vatlayer enricher diff --git a/docs/050-preparation/enricher/170-web.md b/docs/050-preparation/enricher/170-web.md index ddc9e8ef..7db3a9f9 100644 --- a/docs/050-preparation/enricher/170-web.md +++ b/docs/050-preparation/enricher/170-web.md @@ -29,13 +29,13 @@ The Web enricher uses the company website as an input for retrieving additional 1. On the **Configure** tab, provide the following details: - - **Accepted Entity Type** – enter the entity type to define which golden records will be enriched using the Web enricher. + - **Accepted Business Domain** – enter the business domain to define which golden records will be enriched using the Web enricher. - - **Website Vocab Key** – enter the vocabulary key that contains the websites of companies that you want to enrich. + - **Website Vocabulary Key** – enter the vocabulary key that contains the websites of companies that you want to enrich. ![web-enricher-2.png](../../assets/images/preparation/enricher/web-enricher-2.png) -1. Select **Add**. +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. The Web enricher is added and has an active status. This means that it will enrich golden records based on the configuration details during processing or when you trigger external enrichment. @@ -43,9 +43,7 @@ After the Web enricher is added, you can modify its details: - **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. 
- ![web-enricher-3.png](../../assets/images/preparation/enricher/web-enricher-3.png) - -- **Authentication** – modify the details you provided while configuring the enricher: **Accepted Entity Type** and **Website Vocab Key**. +- **Authentication** – modify the details you provided while configuring the enricher. ## Properties from Web enricher diff --git a/docs/050-preparation/enricher/180-build-enricher.md b/docs/050-preparation/enricher/180-build-enricher.md index 9a93b543..6f24c6fb 100644 --- a/docs/050-preparation/enricher/180-build-enricher.md +++ b/docs/050-preparation/enricher/180-build-enricher.md @@ -40,7 +40,7 @@ You will need to install node, npm, yeoman, and the generator itself. npm install -g generator-cluedin-externalsearch ``` -1. Run the generator, providing `Name` and `Entity Type` (for example, Person or Company). +1. Run the generator, providing `Name` and `Business Domain` (for example, Person or Company). ```shell yo cluedin-externalsearch diff --git a/docs/050-preparation/enricher/190-azure-openai.md b/docs/050-preparation/enricher/190-azure-openai.md new file mode 100644 index 00000000..e7604f94 --- /dev/null +++ b/docs/050-preparation/enricher/190-azure-openai.md @@ -0,0 +1,112 @@ +--- +layout: cluedin +nav_order: 4 +parent: Enricher +grand_parent: Preparation +permalink: /preparation/enricher/azure-openai +title: Azure OpenAI +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +This article explains how to add the Azure OpenAI enricher. The purpose of this enricher is to enhance data quality by providing more complete, current, and detailed information for your golden records. It can automate the process of data research, offering up-to-date intelligence on your records and reducing the need for manual efforts. 
+ +The Azure OpenAI enricher supports the following endpoints: + +- `{baseUrl}/openai/deployments/{deploymentName}/completions?api-version=2022-12-01` + +- `{baseUrl}/openai/deployments/{deploymentName}/chat/completions?api-version=2024-06-01` + +You can instruct the Azure OpenAI enricher to enhance your golden records with the help of prompts. These prompts must contain at least: + +- One input property – the vocabulary key that contains the information you provide to an AI model to process and generate a response. + +- One output property – the vocabulary key where the desired result will be stored. + +Here are some examples of prompts: + +- Finding the country based on address: + + ``` + Using the address in {Vocabulary:organization.address} provide the country of the organization in {output:vocabulary:organization.CountryAI} + ``` + +- Translating the text from one language to another: + + ``` + Please get {output:vocabulary:organization.japaneseName} by translating {Vocabulary:organization.name} into Japanese. + ``` + +- Generating the summary or description based on some text: + + ``` + Generate a brief summary {output:vocabulary:website.SummaryAI} based on {Vocabulary:website.WebsiteDescription} and {Vocabulary:website.Title} + ``` + +## Add Azure OpenAI enricher + +To use the Azure OpenAI enricher, you need to have an [Azure OpenAI Service](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal) resource set up in the Azure portal and provide the necessary credentials for that resource in CluedIn. + +**To configure Azure OpenAI integration in CluedIn** + +1. Go to **Administration** > **Azure Integration** > **Azure AI Services**. + +1. Enter the **API Key** used to authenticate and authorize access to your Azure OpenAI resource. + +1. Enter the **Base URL** for your Azure OpenAI resource in a format similar to the following: `https://{resource-name}.openai.azure.com/`. + +1.
Leave the **Resource Key** field empty. + +1. Enter the **Deployment Name** assigned to a specific instance of a model when it was deployed. + +1. Specify the number of **Maximum Requests** that can be sent to the Azure OpenAI deployment. + + ![azure-openai-enricher-1.png](../../assets/images/preparation/enricher/azure-openai-enricher-1.png) + +1. Select **Save**. + + Once the Azure OpenAI integration is configured in CluedIn, proceed to add the Azure OpenAI enricher. + +**To add Azure OpenAI enricher** + +1. On the navigation pane, go to **Preparation** > **Enrich**. Then, select **Add Enricher**. + +1. On the **Choose Enricher** tab, select **Azure OpenAI**, and then select **Next**. + + ![azure-openai-enricher-2.png](../../assets/images/preparation/enricher/azure-openai-enricher-2.png) + +1. On the **Configure** tab, provide the following details: + + 1. **AI Deployment Name** – enter the deployment name assigned to a specific instance of a model when it was deployed. This is the same deployment name as in **Administration** > **Azure Integration** > **Azure AI Services**. + + 1. **Accepted Business Domain** – enter the business domain to define which golden records will be enriched. + + 1. **Prompt** – enter the instruction for Azure OpenAI to generate results. The prompt requires at least one input (e.g., `Vocabulary:XXXX.YYYY`) and one output (e.g., `output:Vocabulary:PPPP.QQQQ`). For example, the following prompt asks Azure OpenAI to translate the company name and store the result in a dedicated vocabulary key: + + ``` + Translate {Vocabulary:trainingcompany.name} into French and put the output into {output:vocabulary:trainingcompany.frenchName}. + ``` + + ![azure-openai-enricher-3.png](../../assets/images/preparation/enricher/azure-openai-enricher-3.png) + +1. Select **Test Connection** to make sure the enricher is properly configured, and then select **Add**. + + The Azure OpenAI enricher is added and has an active status. 
This means that it will enrich relevant golden records during processing or when you trigger external enrichment. + +After the Azure OpenAI enricher is added, you can modify its details: + +- **Settings** – add a user-friendly display name, select the description for data coming from the enricher, and define the source quality for determining the winning values. + +- **Authentication** – modify the details you provided while configuring the enricher. + +## Properties from Azure OpenAI enricher + +You can find the properties added to golden records from the Azure OpenAI enricher on the **Properties** page. The vocabulary keys added to golden records by the Azure OpenAI enricher are grouped under the **No Source** source type. + +![azure-openai-enricher-4.png](../../assets/images/preparation/enricher/azure-openai-enricher-4.png) + +For more detailed information about the changes made to a golden record by the Azure OpenAI enricher, check the corresponding data part on the **History** page. + +![azure-openai-enricher-5.png](../../assets/images/preparation/enricher/azure-openai-enricher-5.png) \ No newline at end of file diff --git a/docs/080-management/020-deduplication.md b/docs/080-management/020-deduplication.md index 1587838b..9c32811c 100644 --- a/docs/080-management/020-deduplication.md +++ b/docs/080-management/020-deduplication.md @@ -10,7 +10,7 @@ permalink: /management/deduplication The goal of deduplication is to eliminate duplicate records by merging them together into a single, accurate, and consolidated golden record. This process maintains full traceability, allowing you to identify contributing records for the resulting golden record and providing the possibility to revert changes if necessary. {:.important} -You can reduce the number of duplicates in the system proactively even before creating a deduplication project. 
For this purpose, CluedIn provides the possibility of merging by codes: those [data parts](/key-terms-and-features/data-life-cycle) that have identical entity origin codes or entity codes are merged during processing. For more information, see [Codes](/integration/review-mapping#codes). +You can reduce the number of duplicates in the system proactively even before creating a deduplication project. For this purpose, CluedIn provides the possibility of merging by identifiers: those [data parts](/key-terms-and-features/data-life-cycle) that have identical primary identifiers or additional identifiers are merged during processing. For more information, see [Identifiers](/integration/review-mapping#identifiers). The following diagram shows the basic steps for merging duplicates in CluedIn. diff --git a/docs/080-management/040-entity-type.md b/docs/080-management/040-entity-type.md index 944620f3..9ff6bc91 100644 --- a/docs/080-management/040-entity-type.md +++ b/docs/080-management/040-entity-type.md @@ -1,85 +1,80 @@ --- layout: cluedin -title: Entity type +title: Business domain parent: Management nav_order: 050 has_children: false permalink: /management/entity-type -tags: ["management","entity type"] --- ## On this page {: .no_toc .text-delta } - TOC {:toc} -An entity type is a well-known business object that describes the semantic meaning of golden records. Entity types can represent physical objects, locations, interactions, individuals, and more. +A business domain is a well-known business object that describes the semantic meaning of golden records. Business domains can represent physical objects, locations, interactions, individuals, and more. -In CluedIn, all golden records must have an entity type to ensure the systematic organization and optimization of data management processes. A well-named entity type is global (for example, Person, Organization, Car) and should not be changed across sources. 
In this article, you will learn how to create and manage entity types to enhance the efficiency and organization of golden records in CluedIn. +In CluedIn, all golden records must have a business domain to ensure the systematic organization and optimization of data management processes. A well-named business domain is global (for example, Person, Organization, Car) and should not be changed across sources. In this article, you will learn how to create and manage business domains to enhance the efficiency and organization of golden records in CluedIn. -## Entity type details page +## Business domain details page -On the entity type details page, you can view relevant information about the entity type and take other actions to manage it. +On the business domain details page, you can view relevant information about the business domain and take other actions to manage it. **Data** -This tab contains all golden records that belong to the entity type. +This tab contains all golden records that belong to the business domain. **Vocabularies** -This tab contains all vocabularies that are associated with the entity type. +This tab contains all vocabularies that are associated with the business domain. **Configuration** -This tab contains general information about the entity type, including: +This tab contains general information about the business domain, including: -- Display name – a user-friendly identifier of the entity type that is displayed throughout the system (for example, in rules, search, streams, and so on). +- Display name – a user-friendly identifier of the business domain that is displayed throughout the system (for example, in rules, search, streams, and so on). -- Entity type code – a string that represents the entity type in code (for example, in clues). +- Business domain identifier – a string that represents the business domain in code (for example, in clues). 
-- Icon – a visual representation of the entity type that helps you quickly identify what kind of golden record it is. +- Icon – a visual representation of the business domain that helps you quickly identify what kind of golden record it is. -- Path – a URL path of the entity type. +- Path – a URL path of the business domain. -- Layout – a way in which information is arranged on the **Overview** tab of the golden records that belong to the entity type. +- Layout – a way in which information is arranged on the **Overview** tab of the golden records that belong to the business domain. -## Create an entity type +## Create a business domain -CluedIn ships with some pre-defined values for common entity types such as Document, File, Organization. However, you might want to have some specific entity types that may not be configured by CluedIn. +CluedIn ships with some pre-defined values for common business domains such as Document, File, Organization. However, you might want to have some specific business domains that may not be configured by CluedIn. -Depending on the selected [data modeling approach](/management/data-catalog/modeling-approaches), you can create an entity type in two ways: +Depending on the selected [data modeling approach](/management/data-catalog/modeling-approaches), you can create a business domain in two ways: -- Automatically – this option is part of the data-first approach. When creating a mapping for a data set, you have the option to enter the name of a new entity type and select the icon. CluedIn will then automatically suggest the entity type. Once the mapping is created, you can then open the entity type and make any necessary adjustments. +- Automatically – this option is part of the data-first approach. When creating a mapping for a data set, you have the option to enter the name of a new business domain and select the icon. CluedIn will then automatically create the business domain. 
Once the mapping is created, you can then open the business domain and make any necessary adjustments. - ![create-entity-type-automatically.gif](../../assets/images/management/entity-type/create-entity-type-automatically.gif) +- Manually – this option is part of the model-first approach, which assumes that you need to create a business domain before using it in the mapping for a data set. The following procedure outlines the steps to manually create a business domain. -- Manually – this option is part of the model-first approach, which assumes that you need to create an entity type before using it in the mapping for a data set. The following procedure outlines the steps to manually create an entity type. +**To create a business domain** -**To create an entity type** +1. On the navigation pane, go to **Management** > **Business domains**. -1. On the navigation pane, go to **Management** > **Entity types**. +1. Select **Create business domain**. -1. Select **Create entity type**. +1. Enter the display name of the business domain. The **Business domain identifier** and the **Path** fields are filled in automatically based on the name that you entered. -1. Enter the display name of the entity type. The **Entity type code** and the **Path** fields are filled in automatically based on the name that you entered. - -1. Select the icon for the visual representation of the entity type. +1. Select the icon for the visual representation of the business domain. 1. Select the layout for arranging the information on the golden record overview page. 1. Select **Create**. - ![create-entity-type.gif](../../assets/images/management/entity-type/create-entity-type.gif) - - The entity type opens, where you can view and manage entity type details. + ![create-business-domain.png](../../assets/images/management/entity-type/create-business-domain.png) -## Manage an entity type + The business domain opens, where you can view and manage business domain details. 
-You can change the following elements of the entity type configuration: display name, icon, layout, and description. You cannot change the entity type code and path. Also, you cannot delete the entity type. +## Manage a business domain -**To edit entity type configuration** +You can change the following elements of the business domain configuration: display name, icon, layout, and description. You cannot change the business domain identifier and path. Also, you cannot delete the business domain. -1. In the upper-right corner of the entity type page, select **Edit**. +**To edit business domain configuration** -1. Make the needed changes, and then select **Save**. +1. In the upper-right corner of the business domain page, select **Edit**. - ![edit-entity-type.gif](../../assets/images/management/entity-type/edit-entity-type.gif) \ No newline at end of file +1. Make the needed changes, and then select **Save**. \ No newline at end of file diff --git a/docs/080-management/050-access.control.md b/docs/080-management/050-access.control.md index 2373a174..c554353d 100644 --- a/docs/080-management/050-access.control.md +++ b/docs/080-management/050-access.control.md @@ -7,7 +7,11 @@ has_children: true permalink: /management/access-control --- -Access control gives you a fine-grained control over who can view specific golden records and vocabulary keys. Together with source control, access control helps you configure reliable and secure access to data in CluedIn. For more information about the combination of source control and access control, see [Data access](/administration/user-access/data-access). +Access control gives you a fine-grained control over who can view and modify specific golden records. Together with source control, access control helps you configure reliable and secure access to golden records in CluedIn. For more information about the combination of source control and access control, see [Data access](/administration/user-access/data-access). + +
+ +
**Limitations of access control** @@ -21,3 +25,5 @@ This section covers the following topics: - [Create access control policy](/management/access-control/create-access-control-policy) – learn how to create and configure an access control policy. - [Manage access control policies](/management/access-control/manage-access-control-policies) – learn how to edit, deactivate, and delete an access control policy, as well how these actions affect user access to data. + +- [Access control reference](/management/access-control/access-control-reference) – learn about the structure of an access control policy and the available actions in the policy rule. diff --git a/docs/080-management/access-control/010-create-access-control-policy.md b/docs/080-management/access-control/010-create-access-control-policy.md index 56f9cfbb..ab680e2f 100644 --- a/docs/080-management/access-control/010-create-access-control-policy.md +++ b/docs/080-management/access-control/010-create-access-control-policy.md @@ -31,29 +31,33 @@ Now that the access control feature is enabled, you can create access control po 1. In the navigation pane, go to **Management** > **Access Control**. -1. Select **Create**. Enter the name of the policy, and then select **Create**. +1. Select **Create Policy**. + +1. Enter the name of the policy, and then select **Create Policy**. 1. In the **Filters** section, set up a filter to define the golden records to which the policy will apply. -1. In the **Rules** section, select **Add rule**, and then define the policy rule: +1. In the **Rules** section, select **Add Policy Rule**, and then define the policy rule: 1. Enter the name of the policy rule. - 1. In **Action**, select **Allow Access**. + 1. (Optional) In **Conditions**, set up additional criteria on top of the filters to define which golden records will be affected by the specific policy rule. - 1. In **Members**, select the users or roles to which the policy rule should apply. + 1. 
In **Action**, select the type of access to golden record properties: view, mask, or add/edit. Learn more in [Access control reference](/management/access-control/access-control-reference). - 1. If you want to allow access to all vocabulary keys in the golden records, select the **All Vocabulary Keys** checkbox. + 1. In **Members**, select the users or roles to which the policy rule will apply. - 1. If you want to allow access only to specific vocabulary keys in the golden records, select the needed vocabulary keys or vocabularies. + 1. If you want to allow access to all vocabulary keys in the golden records affected by the access control policy, select the **All Vocabulary Keys** checkbox. - 1. Select **Add rule**. + 1. If you want to allow access only to specific vocabularies or vocabulary keys in the golden records affected by the access control policy, select the needed vocabulary keys or vocabularies. - You can add multiple policy rules. + ![add-policy-rule.png](../../assets/images/management/access-control/add-policy-rule.png) -1. Save your changes, and then turn on the status toggle to activate the policy. + 1. Select **Add Policy Rule**. + + You can add multiple policy rules. - ![create-access-control-policy.gif](../../assets/images/management/access-control/create-access-control-policy.gif) +1. Save your changes and then turn on the status toggle to activate the policy. It might take up to 1 minute to apply the policy across the system. diff --git a/docs/080-management/access-control/020-manage-access-control-policies.md b/docs/080-management/access-control/020-manage-access-control-policies.md index 6f5731ec..03dfe2fc 100644 --- a/docs/080-management/access-control/020-manage-access-control-policies.md +++ b/docs/080-management/access-control/020-manage-access-control-policies.md @@ -13,19 +13,17 @@ title: Manage access control policies In this article, you will learn how to manage an access control policy and modify it when changes are required. 
-## Edit a policy +## Edit a policy rule -If you need to change something in the access control policy—for example, add more users to the rule or allow access to more vocabulary keys—you can easily do that by editing the policy. +If you need to change something in the access control policy—for example, add more users to the policy rule or add more vocabulary keys—you can easily do that by editing the policy rule. -**To edit an access control policy** +**To edit an access control policy rule** 1. In the navigation pane, go to **Management** > **Access Control**. -1. Find and open the policy that you want to edit. +1. Find and open the access control policy that you want to edit. -1. Make the needed changes, and then save them. - - ![edit-access-control-policy.gif](../../assets/images/management/access-control/edit-access-control-policy.gif) +1. Make the needed changes and then save them. The changes in the access control policy will be applied right away. @@ -37,7 +35,7 @@ There are two ways to activate and deactivate a policy: - From the list of policies – this option allows you to activate or deactivate multiple policies at once. -- From the rule details page. +- From the policy details page. **To activate or deactivate the policy from the list of policies** @@ -45,7 +43,7 @@ There are two ways to activate and deactivate a policy: 1. Near the upper-right corner, select the needed action. - ![deactivate-access-control-policy.gif](../../assets/images/management/access-control/deactivate-access-control-policy.gif) + ![activate-deactive-policy.png](../../assets/images/management/access-control/activate-deactive-policy.png) **To activate or deactivate the policy from the policy details page** @@ -69,6 +67,4 @@ There are two ways to delete a policy: **To delete a policy from the policy details page** -- Select the delete icon. Then, confirm your choice. 
- - ![delete-access-control-policy.gif](../../assets/images/management/access-control/delete-access-control-policy.gif) \ No newline at end of file +- Select the delete icon. Then, confirm your choice. \ No newline at end of file diff --git a/docs/080-management/access-control/030-access-control-reference.md b/docs/080-management/access-control/030-access-control-reference.md new file mode 100644 index 00000000..b31c64df --- /dev/null +++ b/docs/080-management/access-control/030-access-control-reference.md @@ -0,0 +1,85 @@ +--- +layout: cluedin +nav_order: 3 +parent: Access control +grand_parent: Management +permalink: /management/access-control/access-control-reference +title: Access control reference +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will find reference information about the structure of an access control policy and the actions available in the access control policy rule. + +## Access control policy structure + +Access control policy is a mechanism that allows you to precisely define access to golden records and their properties. The access control policy consists of several building blocks. + +![access-control-policy-structure.png](../../assets/images/management/access-control/access-control-policy-structure.png) + +- **Filters** – conditions to define which golden records the access control policy applies to. + +- **Policy rule** – an object that grants specific users and/or roles a particular type of access to golden records that match the rule's filters and conditions. The policy rule contains the following settings: + + - **Conditions** – additional criteria on top of the filters to define which golden records are affected by the specific policy rule. + + - **Action** – operation to allow a specific type of access to golden records that match the rule's filters and conditions. + + - **Members** – users and/or roles to whom the policy rule's action is applied. 
+ + - **All Vocabulary Keys** – a checkbox that, if selected, means that the policy rule's action is applied to all vocabulary keys in golden records that match the rule's filters and conditions. + + - **Vocabulary Keys** – a field where you can select specific vocabulary keys to which the policy rule's action will be applied. + + - **Vocabularies** – a field where you can select specific vocabularies to which the policy rule's action will be applied. + + You need to select specific vocabulary keys or vocabularies to which the policy rule will be applied. If no vocabulary keys or vocabularies are selected, the members of the policy rule won't be able to view golden records at all. + +## Access control policy rule actions + +Access control policy rule actions define the type of access to the properties in golden records that match the rule's filters and conditions. + +![access-control-policy-rule-actions.png](../../assets/images/management/access-control/access-control-policy-rule-actions.png) + +### View + +This action gives access to view the values of specific vocabulary keys in golden records that match the rule's filters and conditions. This action does not give access to edit the existing properties of a golden record or add new properties. That is why the members of the policy rule with the view action will see the icon indicating that the value cannot be edited. + +![access-control-policy-rule-actions-view.png](../../assets/images/management/access-control/access-control-policy-rule-actions-view.png) + +If the users are not granted access to specific vocabularies or vocabulary keys that are used in golden records, they will see the **No value** text on the search page for certain properties. This means that either the property does not exist in a specific golden record or that the users do not have access to the property according to access control. 
+ +![access-control-policy-rule-actions-no-value.png](../../assets/images/management/access-control/access-control-policy-rule-actions-no-value.png) + +If you want to indicate that a golden record has a specific property without displaying this property to users, create a policy rule with the mask action. + +### Mask + +This action hides the values of specific vocabulary keys in golden records that match the rule's filters and conditions. The members of the policy rule with the mask action will see the icon indicating that the value is masked due to access control policy. A masked value cannot be retrieved in any way, and it cannot be edited. + +![access-control-policy-rule-actions-mask.png](../../assets/images/management/access-control/access-control-policy-rule-actions-mask.png) + +{:.important} +The [mask value action](/management/rules/rules-reference) in data part and golden record rules will be deprecated in future releases. Therefore, use the mask action in access control policy rules. + + +### Add/edit + +This action allows the members of the policy rule to add new properties to the golden record and/or edit the existing properties in the golden record. The properties that the members can add/edit depend on the vocabularies and/or vocabulary keys selected in the policy rule: + +- If the policy rule is applied to all vocabulary keys, then the members can add or edit all vocabulary keys that are used in golden records that match the rule's filters and conditions. + +- If the policy rule is applied to specific vocabulary keys, then the members can add or edit only the selected vocabulary keys that are used in golden records that match the rule's filters and conditions. + +- If the policy rule is applied to specific vocabularies, then the members can add or edit any vocabulary keys belonging to the specified vocabularies that are used in golden records that match the rule's filters and conditions. 
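The scope resolution described in the bullets above can be sketched as a small Python function. This is a simplified illustration only, not CluedIn's actual implementation; the function name `editable_keys`, the dictionary field names, and the assumption that a vocabulary groups dot-prefixed vocabulary keys are all assumptions made for the example:

```python
def editable_keys(rule, record_keys):
    """Return the vocabulary keys of a golden record that a policy rule's
    add/edit action covers (simplified illustration, not CluedIn internals)."""
    # Rule applied to all vocabulary keys: every key on the record qualifies.
    if rule["all_vocabulary_keys"]:
        return set(record_keys)
    # Rule applied to specific vocabulary keys: only the selected keys qualify.
    allowed = set(rule.get("vocabulary_keys", []))
    # Rule applied to whole vocabularies: any key in those vocabularies
    # qualifies (here assumed to share the vocabulary name as a dot prefix).
    for vocab in rule.get("vocabularies", []):
        allowed |= {k for k in record_keys if k.startswith(vocab + ".")}
    return allowed & set(record_keys)

record_keys = ["organization.name", "organization.address", "person.email"]
rule = {"all_vocabulary_keys": False, "vocabularies": ["organization"]}
print(sorted(editable_keys(rule, record_keys)))
# ['organization.address', 'organization.name']
```

In this sketch, a rule scoped to the `organization` vocabulary covers `organization.name` and `organization.address` but not `person.email`, matching the third bullet above.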
+ +{:.important} +The add/edit action takes precedence over the view action. If a user is a member of both a policy rule with the view action and a policy rule with the add/edit action, and both rules apply to the same vocabulary keys, the user will be able to add or edit those vocabulary keys. + +**Add/edit action and RACI permissions** + +When adding a property to the golden record, you might want to create a new value. This is possible only if you have the needed RACI permissions: either you are the owner of the vocabulary or you have Consulted access to the management.datacatalog claim. + +When you are trying to add or edit a property in a golden record, CluedIn first checks if you have the required RACI permissions and then it checks if you are allowed to make changes to that specific golden record according to the access control policy. If your access level is less than Consulted, you will not be able to add or edit the property in a golden record. \ No newline at end of file diff --git a/docs/080-management/data-catalog/010-modeling-approaches.md b/docs/080-management/data-catalog/010-modeling-approaches.md index a92c9aa8..969c1ff8 100644 --- a/docs/080-management/data-catalog/010-modeling-approaches.md +++ b/docs/080-management/data-catalog/010-modeling-approaches.md @@ -15,7 +15,7 @@ In this article, you will learn about two data modeling approaches in CluedIn: a ## Data-first approach -The data-first approach focuses on agility, flexibility, and faster time-to-value. It provides the opportunity to ingest your data first and then dynamically create an entity type and vocabulary as needed. +The data-first approach focuses on agility, flexibility, and faster time-to-value. It provides the opportunity to ingest your data first and then dynamically create a business domain and vocabulary as needed. Unlike a predefined schema, the data-first approach uses flexible data structures to adapt to changing requirements. 
This allows your data models to iteratively evolve based on feedback, data analysis, and the dynamic nature of business needs. diff --git a/docs/080-management/data-catalog/020-vocabulary.md b/docs/080-management/data-catalog/020-vocabulary.md index 4a0e9e8c..09c15183 100644 --- a/docs/080-management/data-catalog/020-vocabulary.md +++ b/docs/080-management/data-catalog/020-vocabulary.md @@ -31,7 +31,7 @@ This tab contains general information about the vocabulary, including: - Vocabulary name – a user-friendly identifier of the vocabulary. -- Primary entity type – an entity type linked with the vocabulary. +- Primary business domain – a business domain linked with the vocabulary. - Source – a source system that indicates the origin of data within the vocabulary. @@ -65,7 +65,12 @@ This tab contains tasks for reviewing changes to the vocabulary submitted by use **Audit log** -This tab contains a detailed history of changes to the vocabulary. +This tab contains a detailed history of changes to the vocabulary, such as: + +- Create a vocabulary +- Added user to owners +- Create a vocabulary key +- Delete a vocabulary key ## Create a vocabulary @@ -73,8 +78,6 @@ Depending on the selected [data modeling approach](/management/data-catalog/mode - **Automatically** – this option is part of the data-first approach. When [creating a mapping](/integration/create-mapping) for a data set, you have the option to enter the name of a new vocabulary. CluedIn will then automatically suggest the key prefix and generate the vocabulary. Once the mapping is created, you can then open the vocabulary and make any necessary adjustments. - ![create-vocabulary-mapping.gif](../../assets/images/management/data-catalog/create-vocabulary-mapping.gif) - - **Manually** – this option is part of the model-first approach, which assumes that you need to create a vocabulary before using it in the mapping for a data set. The following procedure outlines the steps to manually create a vocabulary. 
**To create a vocabulary** @@ -85,7 +88,7 @@ Depending on the selected [data modeling approach](/management/data-catalog/mode 1. Enter the name of the vocabulary. -1. Find and select the primary entity type that will most likely use the vocabulary. +1. Find and select the primary business domain that will most likely use the vocabulary. 1. (Optional) Find and select the source of the vocabulary to indicate where the data in the vocabulary comes from. @@ -106,7 +109,7 @@ Depending on the selected [data modeling approach](/management/data-catalog/mode Once the vocabulary is created, you can edit its configuration based on your requirements to ensure the maintenance of organized and consistent metadata. -Only Vocabulary Owners and Administrators can edit the vocabulary configuration. When you're editing a vocabulary configuration, you can change almost all of its aspects: name, primary entity type, source, and description. +Only Vocabulary Owners and Administrators can edit the vocabulary configuration. When you're editing a vocabulary configuration, you can change almost all of its aspects: name, primary business domain, source, and description. **To edit vocabulary configuration** diff --git a/docs/080-management/data-catalog/040-search-data-catalog.md b/docs/080-management/data-catalog/040-search-data-catalog.md index e8c480b4..5616b417 100644 --- a/docs/080-management/data-catalog/040-search-data-catalog.md +++ b/docs/080-management/data-catalog/040-search-data-catalog.md @@ -23,7 +23,7 @@ You can use the search box to find the vocabulary you need. Enter either the ful You can filter the vocabularies using the filter pane in the upper-right corner of the page. The following filters are available: -- Entity type – filters vocabularies based on their primary entity type. By default, all entity types are selected. To narrow down your search results, you can opt for a specific entity type. 
+- Business domain – filters vocabularies based on their primary business domain. By default, all business domains are selected. To narrow down your search results, you can opt for a specific business domain. - Integrations – filters vocabularies based on their source. By default, all integrations are selected. To narrow your search results, you can select an option to display vocabularies that are not associated with a specific source, or you can select a specific source. diff --git a/docs/080-management/deduplication/030-create-deduplication-project.md b/docs/080-management/deduplication/030-create-deduplication-project.md index d0d93699..5b101409 100644 --- a/docs/080-management/deduplication/030-create-deduplication-project.md +++ b/docs/080-management/deduplication/030-create-deduplication-project.md @@ -19,7 +19,7 @@ Before creating a deduplication project, take the following aspects into account - **Number of golden records** that you want to check for duplicates. For an extensive set of data with hundreds of thousands or millions of records, use advanced filters to narrow down the number of records in the project to test your configuration. When you are satisfied with the configuration, you can then modify the filters and run the project on the entire set of data. - For example, if you want to deduplicate all golden records of the Company entity type, start by narrowing down the companies to a specific country. To do this, apply two filter rules: one to identify all golden records of the Company entity type, and another to find all golden records that match the specific country. This approach allows you to refine your matching rules on a targeted subset of data. When you are satisfied with the configuration, simply remove the filter rule for the specific country and run the project for all golden records of the Company entity type. 
+ For example, if you want to deduplicate all golden records of the Company business domain, start by narrowing down the companies to a specific country. To do this, apply two filter rules: one to identify all golden records of the Company business domain, and another to find all golden records that match the specific country. This approach allows you to refine your matching rules on a targeted subset of data. When you are satisfied with the configuration, simply remove the filter rule for the specific country and run the project for all golden records of the Company business domain. - **Matching functions** that you want to use for detecting duplicates. To ensure a faster deduplication process and make it easier to revert merges, create separate projects for equality matching functions and fuzzy matching functions. You can create multiple deduplication projects. @@ -33,19 +33,17 @@ Before creating a deduplication project, take the following aspects into account 1. In the **Choose project type** section, select an option for identifying the golden records that you want to deduplicate: - - **By entity type** – select the entity type; all golden records belonging to the selected entity type will be checked for duplicates. You can add multiple entity types. This is useful when you want to run a deduplication project across similar entity types. + - **By business domain** – select the business domain; all golden records belonging to the selected business domain will be checked for duplicates. You can add multiple business domains. This is useful when you want to run a deduplication project across similar business domains. - **Using advanced filters** – add filter rules; all golden records that meet the filter criteria will be checked for duplicates. You can add multiple filter rules. Read more about filters [here](/key-terms-and-features/filters). This option is useful when you are working with a large set of data. 
You can narrow down the number of golden records and run a deduplication project on a sample set of data to make sure your matching rules work correctly. When you are confident in the effectiveness of your configuration with the sample set, you can then modify the filters and run the project on a larger set of data. -1. (Optional) In the **Description** section, enter the details about the deduplication project. + ![create-dedup-project.png](../../assets/images/management/deduplication/create-dedup-project.png) -1. In the upper-right corner, select **Create**. +1. Select **Create**. - ![сreate-project.gif](../../assets/images/management/deduplication/сreate-project.gif) - - After you create a deduplication project, add matching rules for detecting duplicates among golden records that correspond to the selected entity type or filter criteria. + After you create a deduplication project, add matching rules for detecting duplicates among golden records that correspond to the selected business domain or filter criteria. ## Add a matching rule @@ -71,6 +69,8 @@ In the project, matching rules are combined using the **OR** logical operator, w 1. Choose the [matching function](/management/deduplication/deduplication-reference#matching-functions) for detecting duplicates. + ![add-matching-rule-criteria.png](../../assets/images/management/deduplication/add-matching-rule-criteria.png) + 1. (Optional) Select the [normalization rules](/management/deduplication/deduplication-reference#normalization-rules) to apply during duplicate detection. These rules are temporarily applied solely for the purpose of identifying duplicates. For example, selecting **To lower-case** means that the system will convert values to lower case before comparing them to identify duplicates. 1. Select **Next**. @@ -83,6 +83,4 @@ In the project, matching rules are combined using the **OR** logical operator, w - If you are satisfied with the matching rule configuration, select **Add Rule**. 
- ![add-rule.gif](../../assets/images/management/deduplication/add-rule.gif) - - After you add matching rules, [generate matches](/management/deduplication/manage-a-deduplication-project#generate-matches) to detect duplicates among golden records that correspond to the selected entity type or filter criteria. \ No newline at end of file + After you add matching rules, [generate matches](/management/deduplication/manage-a-deduplication-project#generate-matches) to detect duplicates among golden records that correspond to the selected business domain or filter criteria. \ No newline at end of file diff --git a/docs/080-management/deduplication/040-manage-a-deduplication-project.md b/docs/080-management/deduplication/040-manage-a-deduplication-project.md index 5453a38a..85c48903 100644 --- a/docs/080-management/deduplication/040-manage-a-deduplication-project.md +++ b/docs/080-management/deduplication/040-manage-a-deduplication-project.md @@ -62,7 +62,7 @@ You can edit a deduplication project only when its [status](/management/deduplic Editing a deduplication project involves two aspects: -- Editing the project configuration: project name, entity type or advanced filters, and description. For example, if you used advanced filters to narrow down the number of golden records for the project and you have reached the desired matching rules configuration, you can now modify the filters and run the project on the entire set of data. +- Editing the project configuration: project name, business domain, or advanced filters, and description. For example, if you used advanced filters to narrow down the number of golden records for the project and you have reached the desired matching rules configuration, you can now modify the filters and run the project on the entire set of data. 
- Editing the matching rules configuration: change rule name; [add rule](/management/deduplication/create-a-deduplication-project#add-a-matching-rule); deactivate or delete the rule; modify matching criteria (edit, delete, add). diff --git a/docs/080-management/deduplication/060-deduplication-reference.md b/docs/080-management/deduplication/060-deduplication-reference.md index 942dfed5..d62fa930 100644 --- a/docs/080-management/deduplication/060-deduplication-reference.md +++ b/docs/080-management/deduplication/060-deduplication-reference.md @@ -163,3 +163,34 @@ The following table provides descriptions of the statuses of groups of duplicate The following diagram shows the group of duplicates workflow along with its statuses and main activities. ![group-of-duplicates-status-workflow.gif](../../assets/images/management/deduplication/group-of-duplicates-status-workflow.gif) + +## Deduplication project audit log actions + +Whenever changes or actions are made in a deduplication project, they are recorded and can be found on the **Audit Log** tab. 
These actions include the following: + +- Create a deduplication project +- Add users to owners +- Update a deduplication project +- Create a deduplication rule +- Update a deduplication rule +- Activate a matching rule +- Deactivate a matching rule +- Generate matches +- Discard matches +- Manual conflict resolution +- Reset manual conflict resolutions +- Remove entity from a group +- Approve one group +- Approve multiple groups +- Approve all groups +- Remove (revoke) approval from all groups +- Remove (revoke) approval from groups (one or multiple groups) +- Reject groups (one or multiple groups) +- Merge groups (one or multiple selected groups) +- Merge approved groups +- Undo merged entities +- Undo merge groups +- Abort undo +- Cancel generating matches +- Cancel merge +- Archive a deduplication project \ No newline at end of file diff --git a/docs/080-management/rules/040-power-fx-formulas-in-rules.md b/docs/080-management/rules/040-power-fx-formulas-in-rules.md new file mode 100644 index 00000000..c7d45447 --- /dev/null +++ b/docs/080-management/rules/040-power-fx-formulas-in-rules.md @@ -0,0 +1,115 @@ +--- +layout: cluedin +nav_order: 4 +parent: Rules +grand_parent: Management +permalink: /management/rules/power-fx-formulas +title: Power Fx formulas in rules +last_modified: 2025-07-17 +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +In this article, you will learn about Power Fx formulas that you can use in the Rule Builder to set up filters, conditions, and actions. + +
+ +
+ +Power Fx is a general-purpose, low-code, strong-typed, and functional programming language developed by Microsoft. Power Fx enables you to build and customize applications, workflows, and other solutions by writing simple, declarative expressions rather than traditional code. You can work with Power Fx using _Excel-like formulas_, which makes it intuitive for both technical and business users. For more information, see [Microsoft Power Fx overview](https://learn.microsoft.com/en-us/power-platform/power-fx/overview). + +In CluedIn, you can use Power Fx formulas in the Rule Builder for querying, equality testing, decision making, type conversion, and string manipulation based on the supported properties of a data part or a golden record. + +## Power Fx formulas in rules + +You can use Power Fx formulas in the following types of rules: + +- Data part rules – formulas are available in filters and actions (conditions and formula action). + +- Survivorship rules – formulas are available in filters and actions (conditions). The formula action is not available in survivorship rules. + +- Golden record rules – formulas are available in filters and actions (conditions and formula action). + +In CluedIn, a Power Fx formula requires a _context_ to work with. It is called an `Entity`, and it represents a data part or a golden record. Essentially, it is a global variable that holds a sandboxed version of a data part or a golden record in CluedIn. + +![power-fx-formula-example.png](../../assets/images/management/rules/power-fx-formula-example.png) + +Consider the following example of a formula. + +``` +Right(Entity.Name, 1) = "t" +``` + +This formula consists of the following elements: + +- `Right` – a built-in function that retrieves the rightmost _x_ characters in a string. + +- `Entity` – a context that the formula is working against. + +- `Name` – a string property of `Entity`. + +- `1` – the number of characters to retrieve from the right end of the string. 
+ +- `= "t"` – the equality evaluator that checks if the retrieved character is equal to the letter “t”. + +The formula checks if the rightmost character of `Entity.Name` is `t`. If it is, the formula returns `true`, meaning that the rule will be applied to a specific golden record. If the formula returns `false`, the rule will not be applied to a specific golden record. + +**Custom functions** + +Custom CluedIn functions are designed to help you with querying and setting data for a data part or a golden record. Custom functions include the following: + +- `AddTag` – adds a tag to the golden record's tag collection. This function is analogous to the Add Tag rule action. + +- `GetVocabularyKeyValue` – gets a value from the golden record's properties if such a value exists; otherwise, it returns `Empty` (null). + +- `SetEntityProperty` – sets a golden record metadata property (for example, `Created Date`, `Aliases`, `Description`). + +- `SetVocabularyKeyValue` – sets or adds a vocabulary key to the golden record's properties. + +## Power Fx formula examples + +This section contains some examples of Power Fx formulas in rules. 
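+For instance, assuming the standard Power Fx `IsBlank` function and a hypothetical `user.email` vocabulary key, a filter formula can check whether a record is missing a value: + +``` +IsBlank(GetVocabularyKeyValue(Entity, "user.email")) +``` + +Such a formula returns `true` when the key has no value, so the rule applies only to records without an email. 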
+ +**Set contract status to “Expiring Soon” if the contract ends within 5 days; otherwise, set it to “Active”** + +``` +SetVocabularyKeyValue(Entity, "finance.contractStatus", If(DateDiff(Today(), GetVocabularyKeyValue(Entity, "finance.contractEndDate")) < 5, "Expiring Soon", "Active")) +``` + +**Set salary grade to “Above Target” if the salary is higher than the target salary; otherwise, set it to “Below Target”** + +``` +SetVocabularyKeyValue(Entity, "finance.salaryGrade", If(Value(GetVocabularyKeyValue(Entity, "finance.salary")) > Value(GetVocabularyKeyValue(Entity, "finance.targetSalary")), "Above Target", "Below Target")) +``` + +**Set a vocabulary key value to a date using type conversion and formatting it to ISO format** + +``` +SetVocabularyKeyValue(Entity, "user.startDate", Text(DateValue(GetVocabularyKeyValue(Entity, "user.startDate")), "yyyy-MM-ddTHH:mm:ssZ")) +``` + +**Set full name to the combination of first name and last name, separated by a space** + +``` +SetVocabularyKeyValue(Entity, "employee.fullName", GetVocabularyKeyValue(Entity, "employee.firstName") & " " & GetVocabularyKeyValue(Entity, "employee.lastName")) +``` + +**Add a tag** + +``` +AddTag(Entity, "ThisIsATag") +``` + +**Check if the number of rows on a table/collection equals a value** + +``` +CountRows(Entity.OutgoingEdges) = 1 +``` + +**Set a golden record property to a value** + +``` +SetEntityProperty(Entity, "Encoding", "utf-8") +``` \ No newline at end of file diff --git a/docs/080-management/rules/040-rules-reference.md b/docs/080-management/rules/040-rules-reference.md index 289749d5..80dabce5 100644 --- a/docs/080-management/rules/040-rules-reference.md +++ b/docs/080-management/rules/040-rules-reference.md @@ -25,11 +25,11 @@ The following table contains a list of actions that you can use within data part | Add Tag | Adds a tag to the record. You need to specify the tag that you want to add. | | Add Value | Adds a value to the vocabulary key. 
You can select the existing value or create a new value. Use this action when the vocabulary key doesn't contain any value. | | Add Value with CluedIn AI | Adds a value to the property or vocabulary key according to your prompt. For example, you can check if the email address in the record is a personal address or business address. | -| Change Entity Type | Changes the entity type of the record. You can select the existing entity type or create a new entity type. | +| Change Business Domain | Changes the business domain of the record. You can select the existing business domain or create a new business domain. | | Copy Value | Copies the value from one field (source field) to another (target field). | | Delete Value | Deletes the value that you select. | | Evaluate Regex | Finds values that match a regular expression and replaces them with the needed value. | -| Mask Value | Applies a mask to the value. You can use this action to hide sensitive data. | +| Mask Value | Applies a mask to the value. You can use this action to hide sensitive data. Note that this action will be deprecated in future releases; therefore, use the [mask action](/management/access-control/access-control-reference#) in access control policy rules instead. | | Move Value | Moves the value from one field (source field) to another (target field). | | Normalize Date | Converts the values of the vocabulary key to ISO 8601 format (YYYY-MM-DDT00:00:00+00:00). You can enter the culture or input format to tell CluedIn that you expect dates for the specified vocabulary key to be in a certain way. If you don't enter the culture or input format, CluedIn will analyze the values and determine their date format on its own before converting them to ISO 8601 format. This action gives you more control over how dates are interpreted before they are converted to ISO 8601 format.
**Important!** To use this action, the **Date Time** option must be disabled in **Administration** > **Settings** > **Processing Property Data Type Normalization**. For dates that have already been converted when the **Date Time** option was enabled, the Normalize Date rule action has no effect at all because the dates are already normalized. | | Remove Alias | Removes alias from the record. You need to specify the alias that you want to remove. | @@ -70,8 +70,8 @@ The following table contains a list of actions that you can use within golden re | Add Tag | Adds a tag to the record. You need to specify the tag that you want to add. | | Add Value | Adds a value to the vocabulary key. You can select the existing value or create a new value. Use this action when the vocabulary key doesn't contain any value. | | Add Value with CluedIn AI | Adds a value to the property or vocabulary key according to your prompt. For example, you can check if the email address in the record is a personal address or business address. | -| Change Entity Type | Changes the entity type of the record. You can select the existing entity type or create a new entity type. | -| Mask Value | Applies a mask to the value. You can use this action to hide sensitive data. | +| Change Business Domain | Changes the business domain of the record. You can select the existing business domain or create a new business domain. | +| Mask Value | Applies a mask to the value. You can use this action to hide sensitive data. Note that this action will be deprecated in future releases; therefore, use the [mask action](/management/access-control/access-control-reference#) in access control policy rules instead. | | Normalize Date | Converts the values of the vocabulary key to ISO 8601 format (YYYY-MM-DDT00:00:00+00:00). You can enter the culture or input format to tell CluedIn that you expect dates for the specified vocabulary key to be in a certain way. 
If you don't enter the culture or input format, CluedIn will analyze the values and determine their date format on its own before converting them to ISO 8601 format. This action gives you more control over how dates are interpreted before they are converted to ISO 8601 format.
**Important!** To use this action, the **Date Time** option must be disabled in **Administration** > **Settings** > **Processing Property Data Type Normalization**. For dates that have already been converted when the **Date Time** option was enabled, the Normalize Date rule action has no effect at all because the dates are already normalized. | | Remove Alias | Removes alias from the record. You need to specify the alias that you want to remove. | | Remove All Tags | Removes all tags from the record. | diff --git a/docs/110-key-terms-and-features/010-data-life-cycle.md b/docs/110-key-terms-and-features/010-data-life-cycle.md index e0a7ace7..b1a6cc38 100644 --- a/docs/110-key-terms-and-features/010-data-life-cycle.md +++ b/docs/110-key-terms-and-features/010-data-life-cycle.md @@ -56,7 +56,7 @@ The data part processing step involves the following actions: - Triggering the enrichment – to improve the clues by providing additional details from external services. -- Merging identical clues by codes – to reduce the number of duplicates in the system by merging clues that have identical entity origin codes or entity codes. For more information, see [Codes](/integration/review-mapping#codes). +- Merging identical clues by identifiers – to reduce the number of duplicates in the system by merging clues that have identical primary identifiers or additional identifiers. For more information, see [Identifiers](/integration/review-mapping#identifiers). During **golden record processing**, CluedIn analyzes a data part and determines whether it will produce a new golden record or aggregate into the existing golden record. 
During this step, the following actions take place: diff --git a/docs/110-key-terms-and-features/020-clue-reference.md b/docs/110-key-terms-and-features/020-clue-reference.md index 6407ebd6..da1df108 100644 --- a/docs/110-key-terms-and-features/020-clue-reference.md +++ b/docs/110-key-terms-and-features/020-clue-reference.md @@ -32,9 +32,9 @@ After you create the mapping, you can generate clues and take a look at their st A JSON file with all clues created from your data set is downloaded to your computer. You can open the file in any text editor. -It is a good idea to check the clues before processing to make sure they contain all the necessary details so that identical clues can be merged by codes, thus reducing the number of duplicates in the system. You can do the following: +It is a good idea to check the clues before processing to make sure they contain all the necessary details so that identical clues can be merged by identifiers, thus reducing the number of duplicates in the system. You can do the following: -- Check if the codes are correct. +- Check if the identifiers are correct. - Check if the property and pre-process rules have been applied as intended. @@ -98,20 +98,20 @@ The following table contains the properties that you can find in the clue. | Property | Description | |--|--| | `attribute-organization` | GUID of the organization. | -| `attribute-origin` | Entity origin code of the clue as defined in the mapping details. It is the primary identifier that uniquely represents the clue. This property appears in several places in the clue structure. | +| `attribute-origin` | Primary identifier of the clue as defined in the mapping details. It is the primary identifier that uniquely represents the clue. This property appears in several places in the clue structure. | | `attribute-appVersion` | Version of the clue schema. This property appears in several places in the clue structure. 
| | `attribute-originProviderDefinitionId` | GUID of the source that sent the data to CluedIn. The source can be a file, a database, an ingestion endpoint, a manual data entry project, and so on. Each source in CluedIn has a provider definition ID. The provider definition ID is used to restrict or grant user access to a specific data source. | | `attribute-inputSource` | System that pushed the clue. Usually, the clue is created as a result of the mapping process, which is represented by the `cluedin-annotation` service. | | `name` | Name of the clue. The name is shown on the search results page and on the golden record details page. | | `description` | Description of the clue. The description is shown in the default search results view and on the golden record details page. | -| `entityType` | A common attribute that defines the business domain that the clue belongs to. The selection of the entity type is part of the mapping process. | +| `entityType` | A common attribute that defines the business domain that the clue belongs to. The selection of the business domain is part of the mapping process. | | `attributeSource` | A service source of the clue. | | `codes` | Additional unique identifiers of the clue as defined in the mapping details. | -| `edges` | An object that represents a specific relation (`attribute-type`) between the source clue (`attribute-from`) and the target clue (`attribute-to`), identified through their entity origin codes. | +| `edges` | An object that represents a specific relation (`attribute-type`) between the source clue (`attribute-from`) and the target clue (`attribute-to`), identified through their primary identifiers. | | `properties` | An object (also called a _property bag_) that contains all the properties (vocabulary keys) of the clue. | | `attribute-type` | A format of data in the property bag. | | `aliases` | A value used as an alternative or secondary name associated with the clue. 
| -| `tags` | A value used as a label to categorize clues across entity types. | +| `tags` | A value used as a label to categorize clues across business domains. | | `createdDate` | Date when the record was created in the source system. | | `modifiedDate` | Date when the record was modified in the source system. | | `quarantine` | Metadata information about the rules that were applied to the clue to send it to quarantine. | @@ -132,7 +132,7 @@ To post clues to CluedIn, use the following post URL: `{{baseurl}}/public/api/v2 ### Delta clues and why you may need them -By default, CluedIn works in "Append" mode, where as new data comes in, it will append the data over the top of existing data that has a matching Entity Code or will create new Golden records where there is no matching Entity Code. +By default, CluedIn works in append mode, where, as new data comes in, it will append the data over the top of existing data that has a matching primary identifier or will create new golden records where there is no matching primary identifier. There are situations where you actually don't want your new data to be in "Append" mode, but rather you want to change the way that CluedIn will process and treat this data. @@ -142,7 +142,7 @@ The most common examples include: - You are sending data to CluedIn with "Blank Values" and your intention is to ask CluedIn to "forget" that this column or columns ever had a value for this. For example, you had a phone number from a company and you would rather now have a blank value for this as the phone number is no longer active. 
+ - You are sending data to CluedIn with an updated value for a column and your intention is to ask CluedIn to **REMOVE** the old identifier or edge that could have been built off of this, and **REPLACE** it with the new ones, not just **APPEND** over the top. - You are sending blank data to CluedIn and you actually want CluedIn to treat the values as Blank and hence if a Golden Record has a value of "Hello" you actually want to turn that value into "". @@ -152,7 +152,7 @@ The most common examples include: You can post clues that remove outgoing or incoming edges from golden records. Following is an example of a clue that removes an outgoing edge from a golden record. -### Remove Edge +### Remove edge ``` { @@ -210,7 +210,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Edge and Add Edge in same Clue (equivelant of an Update) +### Remove edge and add edge in same clue (equivalent of an update) ``` { @@ -272,7 +272,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Incoming Edges using Delta Clues +### Remove incoming edges using delta clues ``` @@ -302,7 +302,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Properties using Delta Clues +### Remove properties using delta clues ``` @@ -332,7 +332,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Entity Code using Delta Clues +### Remove identifier using delta clues ``` @@ -362,7 +362,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Tag using Delta Clues +### Remove tag using delta clues ``` @@ -392,7 +392,7 @@ You can post clues that remove outgoing or incoming edges from golden records. 
F } ``` -### Remove Alias using Delta Clues +### Remove alias using delta clues ``` @@ -422,7 +422,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Description using Delta Clues +### Remove description using delta clues ``` @@ -451,7 +451,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Name using Delta Clues +### Remove name using delta clues ``` @@ -480,7 +480,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Display Name using Delta Clues +### Remove display name using delta clues ``` @@ -509,7 +509,7 @@ You can post clues that remove outgoing or incoming edges from golden records. F } ``` -### Remove Author using Delta Clues +### Remove author using delta clues ``` diff --git a/docs/110-key-terms-and-features/030-search.md b/docs/110-key-terms-and-features/030-search.md index 20c67f4d..1c257387 100644 --- a/docs/110-key-terms-and-features/030-search.md +++ b/docs/110-key-terms-and-features/030-search.md @@ -12,69 +12,107 @@ tags: ["search"] - TOC {:toc} -CluedIn allows you to search over all golden records in the platform. You can enter a keyword in the search box, and CluedIn will return all relevant results. You can also use [filters](/key-terms-and-features/filters) to precisely define the golden records you're looking for based on various criteria. In this article, you will learn how to use the search capabilities and customize the search results page to better suit your specific needs. +In this article, you will learn how to use the search capabilities in CluedIn to efficiently find the golden records you need.
+You can search through all golden records in CluedIn—just enter a keyword in the search box, and CluedIn will return all relevant results. Use [filters](/key-terms-and-features/filters) to precisely define the golden records you're looking for based on various criteria. + +## Search box + +The search box is the starting point of your search for golden records. This is where you can enter a keyword to start the search. Additionally, this is where you can quickly retrieve your recent searches and saved searches by clicking in the search box. + +![search-box.png](../../assets/images/key-terms-and-features/search-box.png) + +The **Recent Searches** section displays up to 5 of your previous searches. To run a recent search, select it from the list. + +The **Saved Searches** section displays up to 7 saved searches. It contains a toggle to switch between your private saved searches and shared saved searches. In the screenshot above, the toggle is turned on to show private saved searches (**My Saved Searches**). Turning the toggle off will show public saved searches (**Shared Saved Searches**). To run a saved search, select it from the list. If you cannot find the saved search, select **View all saved searches**, and then look for the needed search in the **Saved Searches** pane. You can find more information in [Saved searches](#saved-searches). + +In the search box, you can also select a business domain in which you want to search for golden records. To do this, expand the dropdown list and select the needed business domain. + +![search-box-business-domains.png](../../assets/images/key-terms-and-features/search-box-business-domains.png) + +The **Business Domains** dropdown list displays up to 8 business domains with the largest number of golden records. You can also view the number of golden records per business domain. 
After you select a business domain, enter a keyword, and start to search, CluedIn will display only those golden records that match the keyword and belong to the selected business domain. + ## Search results page -By default, the search results page displays golden records in the tabular view in the following columns: Name, Entity Type, and Description. You can customize the search results page to focus on the information that is important to you. +By default, the search results page displays golden records in the tabular view in the following columns: **Name**, **Business Domain**, and **Description**. You can customize the search results page to focus on the information that is important to you. In this section, you will find different customization options. -![search-5.png](../../assets/images/key-terms-and-features/search-5.png) +![search-results-page-default.png](../../assets/images/key-terms-and-features/search-results-page-default.png) ### Add columns -If you want to see other properties or vocabulary keys, you can add the needed columns to the search results page. +If you want to view other golden record properties or vocabulary keys, you can add the corresponding columns to the search results page. **To add columns to the search results page** 1. On the search results page, select **Column Options**. + The **Column Options** pane opens where you can view the columns that are currently displayed on the search results page. + 1. Select **Add columns**, and then choose the type of column that you want to add to the search results page: - - **Entity Property** – this option allows you to select the following properties: Date Created, Discovery Date, or Date Modified. After you choose the needed properties, select **Save Selection**. + - **Entity Property** – to select the following golden record properties: **Date Created**, **Discovery Date**, or **Date Modified**. + + - **Vocabulary** – to find and select any vocabulary keys. + +1. 
Depending on the type of column that you selected, do one of the following: + + - For **Entity Property**, select the checkboxes next to the needed properties, and then select **Save Selection**. + + ![search-select-property-to-add-as-column.png](../../assets/images/key-terms-and-features/search-select-property-to-add-as-column.png) + + - For **Vocabulary**, do the following: - - **Vocabulary** – this option allows you to find and select any vocabulary keys. The following steps will guide you through the procedure of adding vocabulary keys to the search results page. + 1. In the **Vocabulary Keys** section, find the vocabulary that contains the vocabulary keys that you want to add to the search results page. By default, this section displays the vocabularies that are used in golden records. To limit the search results, use filters. You can choose to view the vocabularies that are associated with a specific business domain or the vocabularies from a specific integration. -1. In the search field, enter the name of the vocabulary and start the search. Then, select the needed vocabulary keys. + If you want to add all vocabulary keys from a specific vocabulary, select the checkbox next to the vocabulary name. If you want to add specific vocabulary keys, expand the vocabulary and select the checkboxes next to the needed vocabulary keys. - To limit the search results, use filters. You can choose to view the vocabulary keys of a specific data type or classification, or vocabulary from specific integrations. + 1. Move the vocabulary keys to the **Selected Vocabulary Keys** section using the arrow pointing to the right. -1. After you chose the needed vocabulary keys, select **Add Vocabulary Columns**. + If you decide that you do not want to add a specific vocabulary or vocabulary key, you can move it back to the **Vocabulary Keys** section. 
To do this, select the checkbox next to the needed element in the **Selected Vocabulary Keys** section, and then use the arrow pointing to the left. - The columns are added to the search results page. + 1. After you move all the needed vocabulary keys to the **Selected Vocabulary Keys** section, select **Add Vocabulary Columns**. - ![search-6.gif](../../assets/images/key-terms-and-features/search-6.gif) + ![search-select-vocab-to-add-as-column.png](../../assets/images/key-terms-and-features/search-select-vocab-to-add-as-column.png) -### Reorder columns + The columns are added to the search results page. You can close the **Column Options** pane. -If you want to improve the organization of information on the search results page, you can change the order of columns. +### Manage columns -**To reorder columns on the search results page** +If you want to improve the organization of information on the search results page, you can change the order of columns or remove columns. + +**To manage columns on the search results page** 1. On the search results page, select **Column Options**. -1. Select the row and drag it to a new position in the list. + The **Column Options** pane opens where you can view the columns that are currently displayed on the search results page. + + ![search-column-options.png](../../assets/images/key-terms-and-features/search-column-options.png) + +1. To reorder columns, select the row and drag it to a new position in the list. + +1. To remove a column, select the delete icon in the corresponding row. - You changes are immediately applied to the search results page. After you reorder the columns, close the **Column Options** pane. + Your changes are immediately applied to the search results page. After you reorder or delete the columns, close the **Column Options** pane. -### Sort records +### Sort search results -By default, the golden records on the search results page are sorted by relevance. 
This means that CluedIn prioritizes golden records that are most likely to match your search query closely. Sorting by relevance ensures that the most pertinent results are displayed at the top of the page, facilitating efficient data retrieval. +By default, the golden records on the search results page are sorted by relevance. This means that CluedIn prioritizes golden records that are most likely to match your search query closely. Sorting by relevance ensures that the most pertinent results are displayed at the top of the page. In addition to sorting by relevance, CluedIn provides two alternative sorting options: -- Sorting by new – choosing this option arranges golden records in descending order of their creation or modification date, with the most recently added or updated records appearing at the top. +- **Sorting by latest** – to arrange golden records in descending order of their creation or modification date, with the most recently added or updated records appearing at the top. -- Sorting by old – choosing this option arranges golden records in ascending order of their creation or modification date, with the oldest records appearing at the top. +- **Sorting by old** – to arrange golden records in ascending order of their creation or modification date, with the oldest records appearing at the top. -**To sort records** +**To sort the search results** - In the upper-right corner of the search results page, expand the sorting dropdown menu, and then select the needed sorting option. - ![search-1.png](../../assets/images/key-terms-and-features/search-1.png) + ![search-results-page-sorting.png](../../assets/images/key-terms-and-features/search-results-page-sorting.png) The new sorting is applied to the search results. @@ -82,43 +120,57 @@ In addition to sorting by relevance, CluedIn provides two alternative sorting op CluedIn provides two page view options: -- **Tile view** (a) – presents records in a visual grid-like format. 
In this view, records are arranged in rectangular tiles, each representing a specific record. +- **Tile view** – presents golden records in a visual grid-like format. In this view, golden records are arranged in rectangular tiles, each representing a specific golden record. + + ![search-tile-view.png](../../assets/images/key-terms-and-features/search-tile-view.png) + +- **Tabular view** – presents golden records in a structured table format. In this view, golden records are arranged in rows and columns, resembling a spreadsheet or database table. + + ![search-tabular-view.png](../../assets/images/key-terms-and-features/search-tabular-view.png) -- **Tabular view** (b) – presents records in a structured table format. In this view, records are arranged in rows and columns, resembling a spreadsheet or database table. +**To change the page view** -![search-2.png](../../assets/images/key-terms-and-features/search-2.png) +- In the upper-right corner of the search results page, select the needed page view option: **Tile view** (a) or **Tabular view** (b). -To change the page view, simply select the needed page view option. + ![search-results-page-view.png](../../assets/images/key-terms-and-features/search-results-page-view.png) + + The new view is applied to the search results page. ## Saved searches -Saved searches help you quickly retrieve a set of golden records that meet specific filter criteria. You can share the search with everybody else in the organization or just keep it to yourself. +Saved searches help you quickly retrieve a set of golden records that meet specific filter criteria. Once you define the filters and add the needed vocabulary keys to the search page, you can save the current search configuration for future use. The next time you need to review a specific set of golden records and their vocabulary keys, you can quickly open a saved search instead of configuring the search from scratch.
You can share the saved search with everybody else in the organization or just keep it to yourself. **To save a search** 1. In the upper-right corner of the search results page, select the save icon. + ![save-icon.png](../../assets/images/key-terms-and-features/save-icon.png) + 1. Enter the name of the search. 1. If you want to make this search available to everybody in your organization, turn on the toggle next to **Shared**. 1. Select **Save**. - ![search-3.png](../../assets/images/key-terms-and-features/search-3.png) + ![save-a-search.png](../../assets/images/key-terms-and-features/save-a-search.png) - The search is saved in CluedIn. Now, you can use it when you need to quickly find a specific set of records or when you want to [clean](/preparation/clean) those records. + The search is saved in CluedIn. Now, you can use it when you need to quickly find a specific set of golden records or when you want to [clean](/preparation/clean) those golden records. **To retrieve a saved search** -1. In the upper-right corner of the search results page, open the three-dot menu, and then select **Saved Searches**. +1. Do one of the following: + + - Click anywhere in the search box, and then use the toggle to define which saved searches you want to access: your private saved searches (**My Saved Searches**) or public saved searches (**Shared Saved Searches**). If you cannot find the needed saved search, select **View all saved searches**. + + - In the upper-right corner of the search results page, open the three-dot menu, and then select **Saved Searches**. - ![search-4.png](../../assets/images/key-terms-and-features/search-4.png) + ![saved-search.png](../../assets/images/key-terms-and-features/saved-search.png) - The **Saved Searches** pane opens, containing your own saved searches and shared saved searches. + The **Saved Searches** pane opens, containing your private saved searches and shared saved searches. 1. Find and select the needed saved search. 
- The golden records matching the saved search filters are displayed on the page. + The golden records matching the saved search filters are displayed on the search results page. ## Export search results @@ -134,20 +186,26 @@ After performing a search, you can export your results in one of the following f 1. In the upper-right corner of the search results page, open the three-dot menu, and then select **Export**. -1. In **Export name**, enter the name of the export file. + ![search-export-golden-records.png](../../assets/images/key-terms-and-features/search-export-golden-records.png) -1. In **Exporting format**, select the file format you want to use for exporting search results. +1. In **Export Name**, enter the name of the export file. + +1. In **Exporting Format**, select the file format you want to use for exporting search results. 1. In **Filters**, review the search filters that define which golden records will be exported. -1. In **Columns**, review the columns that will be exported. These columns are currently displayed on the search results page. If you want to change the columns, you can add, remove, or reorder the columns. +1. In **Columns**, review the columns that will be exported. These columns are currently displayed on the search results page. If you want to change the columns, you can add, remove, or reorder the columns as described in [Manage columns](#manage-columns). + + ![search-export-dialog.png](../../assets/images/key-terms-and-features/search-export-dialog.png) + +1. Select **Export**. After the file for export is prepared, you will receive a notification. -1. Select **Export**. After the file for export is prepared, you'll receive a notification. +1. In the notification, select **View**. The **Exported Files** page opens, where you can find the files available for download. -1. In the notification, select **View**. The **Task** > **Exported files** page opens, where you can find the files available for download. 
+ ![search-exported-files-page.png](../../assets/images/key-terms-and-features/search-exported-files-page.png) - ![export-golden-records.gif](../../assets/images/key-terms-and-features/export-golden-records.gif) + To view the search filters that define golden records in a file, hover over the value in the **Filter Type** column. - You can view the search filters that define golden records in a file by hovering over the value in the **Filter type** column. +1. To download the file, select the download button next to the file name or the file name itself. -1. To download the file, select the download button or the file name. + The exported file is downloaded to your computer. diff --git a/docs/110-key-terms-and-features/040-filters.md b/docs/110-key-terms-and-features/040-filters.md index e227e18d..b5968406 100644 --- a/docs/110-key-terms-and-features/040-filters.md +++ b/docs/110-key-terms-and-features/040-filters.md @@ -56,7 +56,7 @@ When adding more rules to the filter, pay attention to the **AND**/**OR** operat Filters in search help you narrow down the exact records you want to see or use for such activities as merge or clean. There are two filter modes in search: -- **Basic** – you can filter records by entity types, providers, sources, or tags (if available in the system). You can select multiple values in each filter parameter. +- **Basic** – you can filter records by business domains, providers, sources, or tags (if available in the system). You can select multiple values in each filter parameter. ![filters-5.png](../../assets/images/key-terms-and-features/filters-5.png) @@ -82,11 +82,11 @@ This section includes descriptions of properties and operations available in fil | Created Date | Date when the record was created in the source system or date when the record was created via a manual data entry project in CluedIn. | | Description | Description of the record. | | Discovery Date | Date when the record was discovered in CluedIn during processing. 
| -| Display Name | Name of the record in CluedIn that is shown at the top of on the record details page next to the entity type. If the record in the source system does not contain the display name, then the Name is shown instead of Display Name. | +| Display Name | Name of the record in CluedIn that is shown at the top of the record details page next to the business domain. If the record in the source system does not contain the display name, then the Name is shown instead of Display Name. | | Document Mime Type | A label that specifies the nature and format of the record. It is part of the metadata for records added via a crawler or posted to CluedIn. | | Encoding | A type of encoding scheme (e.g., UTF-8) used to represent the characters in the record. It is part of the metadata for records added via a crawler or posted to CluedIn. | | Entity Codes | Additional unique identifiers of the record. | -| Entity Type | An attribute of the record that corresponds to a specific business domain. You can set up a filter to return only those records that are associated with a particular entity type. | +| Business Domain | An attribute of the record that corresponds to a specific business domain. You can set up a filter to return only those records that are associated with a particular business domain. | | Last Changed By | User who was the last to edit the record. | | Last Processed Date | Date when the record was processed for the last time. | | Modified Date | Date when the record was modified in the source system or date when the record has been modified via clean, deduplication, or manual data entry project in CluedIn. | @@ -97,7 +97,7 @@ This section includes descriptions of properties and operations available in fil | Provider | A system from which the data came to CluedIn, such as SAP, HubSpot, or Azure Data Lake. You can set up a filter to return only those records that were created from a specific provider. | | Revision | A version number of the record.
It is part of the metadata for records added via a crawler or posted to CluedIn. | | Source | A source of the incoming data, such as a specific file, endpoint, or database. You can set up a filter to return only those records that were created from a specific source. | -| Tag | A label that automatically categorizes records across entity types. You can set up a filter to return only those records that contain a specific tag. | +| Tag | A label that automatically categorizes records across business domains. You can set up a filter to return only those records that contain a specific tag. | ### Operations diff --git a/docs/110-key-terms-and-features/050-golden-records.md b/docs/110-key-terms-and-features/050-golden-records.md index fe209a56..ca422693 100644 --- a/docs/110-key-terms-and-features/050-golden-records.md +++ b/docs/110-key-terms-and-features/050-golden-records.md @@ -99,7 +99,7 @@ All in all, a data part is a record in a format that CluedIn can understand and ### Golden record (golden) -When you [process](/integration/process-data) the data set, CluedIn checks if the data part can be associated with the existing golden record. If the data part can be associated with the existing golden record—they share the same [codes](/key-terms-and-features/entity-codes)—then it is **aggregated to the existing golden record**. In this case, the golden record is re-processed. If the data part cannot be associated with the existing golden record, then a **new golden record is created**. +When you [process](/integration/process-data) the data set, CluedIn checks if the data part can be associated with the existing golden record. If the data part can be associated with the existing golden record—they share the same [identifiers](/key-terms-and-features/entity-codes)—then it is **aggregated to the existing golden record**. In this case, the golden record is re-processed. 
If the data part cannot be associated with the existing golden record, then a **new golden record is created**. ![golden-record-8.png](../../assets/images/key-terms-and-features/golden-record-8.png) diff --git a/docs/110-key-terms-and-features/060-billable-records.md b/docs/110-key-terms-and-features/060-billable-records.md index 16ce9cdb..d446230a 100644 --- a/docs/110-key-terms-and-features/060-billable-records.md +++ b/docs/110-key-terms-and-features/060-billable-records.md @@ -25,10 +25,10 @@ Not all data parts are considered billable records. The data parts that appeared {:.important} Every time a golden record is processed, the count of billable records is recalculated. If you remove records from CluedIn, the count of billable records would go down as it is recalculated. -Each record that you ingest and process from your source counts as a billable record. If 2 records come from the same source and share the same code ([entity origin code (primary identifier)](/key-terms-and-features/entity-codes) and [origin](/key-terms-and-features/origin)) then they are considered exact duplicates and are merged into one golden record. This is 1 billable record. +Each record that you ingest and process from your source counts as a billable record. If 2 records come from the same source and share the same identifier ([primary identifier](/key-terms-and-features/entity-codes) and [origin](/key-terms-and-features/origin)) then they are considered exact duplicates and are merged into one golden record. This is 1 billable record. If 2 similar records come from different sources and are identified as duplicates, they are also merged into 1 golden record, but they will be counted as 2 billable records. If 2 unique records come from the same or different sources, then 2 golden records are created. Therefore, there are 2 data parts, which means that there are 2 billable records. 
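The billable-record counting rules above can be sketched in a few lines of Python. This is an illustrative model only; the field names (`origin`, `identifier`) are assumptions for readability, not CluedIn's billing implementation:

```python
# Illustrative model of billable-record counting (not CluedIn's engine):
# billable records correspond to data parts. Exact duplicates (same
# primary identifier AND same origin) collapse into one data part, while
# matching records from different origins remain separate data parts.
def count_billable(records):
    data_parts = {(r["origin"], r["identifier"]) for r in records}
    return len(data_parts)

# 2 records from the same source sharing the same identifier
# merge into 1 golden record -> 1 billable record.
same_source = [
    {"origin": "CRM", "identifier": "42"},
    {"origin": "CRM", "identifier": "42"},
]
assert count_billable(same_source) == 1

# 2 matching records from different sources also merge into
# 1 golden record, but count as 2 billable records.
cross_source = [
    {"origin": "CRM", "identifier": "42"},
    {"origin": "HR", "identifier": "42"},
]
assert count_billable(cross_source) == 2
```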
-When you ingest a record for the first time, a unique entity origin code (primary identifier) is created in the mapping to identify this record once processed. If you later change the value that was used to create the entity origin code (primary identifier) in the golden record and re-ingest the original record with the updated value, this will be considered as 2 data parts, resulting in 2 billable records. This is because CluedIn perceives the code that was initially created during the mapping as different from the updated record. \ No newline at end of file +When you ingest a record for the first time, a unique primary identifier is created in the mapping to identify this record once processed. If you later change the value that was used to create the primary identifier in the golden record and re-ingest the original record with the updated value, this will be considered as 2 data parts, resulting in 2 billable records. This is because CluedIn perceives the identifier that was initially created during the mapping as different from the updated record. \ No newline at end of file diff --git a/docs/110-key-terms-and-features/080-eventual-connectivity.md b/docs/110-key-terms-and-features/080-eventual-connectivity.md index de358c8c..0c4e219f 100644 --- a/docs/110-key-terms-and-features/080-eventual-connectivity.md +++ b/docs/110-key-terms-and-features/080-eventual-connectivity.md @@ -24,7 +24,7 @@ The goal of eventual connectivity is not to model your data into a final form bu **Underlying concepts** -- Codes – universally unique identifiers of a record. Read more about codes [here](/key-terms-and-features/entity-codes). +- Identifiers – universally unique identifiers of a record. Read more about identifiers [here](/key-terms-and-features/entity-codes). - Edges – a way to instruct CluedIn that there is a reference from one object to another. Edges are the key behind the eventual connectivity pattern. Read more about edge [here](/key-terms-and-features/edges). 
diff --git a/docs/110-key-terms-and-features/090-entity-type.md b/docs/110-key-terms-and-features/090-entity-type.md index e926bba6..d9d2bec8 100644 --- a/docs/110-key-terms-and-features/090-entity-type.md +++ b/docs/110-key-terms-and-features/090-entity-type.md @@ -1,6 +1,6 @@ --- layout: cluedin -title: Entity type (Business domain) +title: Business domain parent: Key terms and features nav_order: 9 has_children: false @@ -12,44 +12,44 @@ tags: ["development","entities","entity-types"] - TOC {:toc} -An entity type represents a specific business domain for data. It can signify physical objects, individuals, locations, and more. Entity types should be **global**, **timeless**, and **independent of specific data sources** (e.g., Contact, Organization, Car). +A business domain represents a specific business object that describes the semantic meaning of golden records. It can signify physical objects, individuals, locations, and more. Business domains should be **global**, **timeless**, and **independent of specific data sources** (e.g., Contact, Organization, Car). -Each golden record is associated with exactly one entity type. You can leverage built-in entity types or define your own custom entity types. +Each golden record is associated with exactly one business domain. You can leverage built-in business domains or define your own custom business domains. -An entity type is assigned to a clue during the mapping process, and it plays a critical role in various [codes](/key-terms-and-features/entity-codes), including entity origin codes and entity codes. +A business domain is assigned to a clue during the mapping process, and it plays a critical role in various [identifiers](/key-terms-and-features/entity-codes), including primary identifiers and additional identifiers. 
-## Entity type usage +## Business domain usage ### Adding semantic context to golden records -The entity type provides **metadata**, such as a display name, icon, and description, adding a semantic layer to golden records. Golden records sharing the same entity type inherently share the same semantic meaning. +The business domain provides **metadata**, such as a display name, icon, and description, adding a semantic layer to golden records. Golden records sharing the same business domain inherently share the same semantic meaning. -When defining entity types, it’s essential to balance specificity and genericness. Use terminology familiar to your line of business (LOB) to help identify records intuitively. +When defining business domains, it’s essential to balance specificity and genericness. Use terminology familiar to your line of business (LOB) to help identify records intuitively. {:.important} -CluedIn is flexible—if you choose an initial entity type that needs adjustment, you can change it. However, changing entity types mid-project can be cumbersome, especially if deduplication projects, rules, or streaming configurations have already been applied. +CluedIn is flexible—if you choose an initial business domain that needs adjustment, you can change it. However, changing business domains mid-project can be cumbersome, especially if deduplication projects, rules, or streaming configurations have already been applied. ### Filtering golden records -Entity type acts as the default filter for many operations in CluedIn. Selecting the right entity type allows you to target groups of golden records that logically belong together. +Business domain acts as the default filter for many operations in CluedIn. Selecting the right business domain allows you to target groups of golden records that logically belong together. -### Producing entity origin code (primary identifier) +### Producing primary identifier -Entity type forms part of the primary identifier value.
This structure enforces that records can only merge if they share the same entity type. +Business domain forms part of the primary identifier value. This structure enforces that records can only merge if they share the same business domain. -## Entity type properties and characteristics +## Business domain properties and characteristics -### Entity type code +### Business domain code -Entity types have a unique code, represented as a simple string prefixed by a slash (/). The code uniquely identifies the entity type. That is why you see a slash (/) in front of an entity type. +Business domains have a unique code, represented as a simple string prefixed by a slash (/). The code uniquely identifies the business domain. That is why you see a slash (/) in front of a business domain. -To create an entity type code, use concise, meaningful names and avoid non-alphanumeric characters where possible. +To create a business domain code, use concise, meaningful names and avoid non-alphanumeric characters where possible. -### Nested entity types +### Nested business domains -CluedIn supports nested entity types, allowing hierarchical organization of entity types. While not mandatory, this feature can help group entity types of the same nature. +CluedIn supports nested business domains, allowing hierarchical organization of business domains. While not mandatory, this feature can help group business domains of the same nature. -**Example of nested entity types** +**Example of nested business domains** In the following hierarchy, Video is a child of Document. @@ -60,12 +60,12 @@ In the following hierarchy, Video is a child of Document. /Document/Audio ``` -**Benefits of nested entity types** +**Benefits of nested business domains** -- **Filter grouped entities** – nested entity types allow you to filter or stream entities collectively. For example, using a filter that starts with /Document would include all documents, regardless of their specific sub-entity type. 
+- **Filter grouped entities** – nested business domains allow you to filter or stream entities collectively. For example, using a filter that starts with /Document would include all documents, regardless of their specific sub-business domain. -- **Streamline reporting** – nested types simplify reporting and analysis across related entity types. +- **Streamline reporting** – nested business domains simplify reporting and analysis across related business domains. ## Useful resources -- [Add or modify an entity type](/management/entity-type) \ No newline at end of file +- [Add or modify a business domain](/management/entity-type) \ No newline at end of file diff --git a/docs/110-key-terms-and-features/100-entity-codes.md b/docs/110-key-terms-and-features/100-entity-codes.md index 365c81bc..d70e545b 100644 --- a/docs/110-key-terms-and-features/100-entity-codes.md +++ b/docs/110-key-terms-and-features/100-entity-codes.md @@ -1,6 +1,6 @@ --- layout: cluedin -title: Entity codes (Identifiers) +title: Identifiers parent: Key terms and features nav_order: 10 has_children: false @@ -12,37 +12,37 @@ tags: ["development","entities","entity-codes"] - TOC {:toc} -A **code (identifier)** is a mechanism that CluedIn uses to define the **uniqueness** of a golden record. During processing, if two clues share the **same code**, they are **merged** into a single golden record. This ensures that data from different sources is unified under a consistent, unique identifier. +An **identifier** (previously known as **code**) is a mechanism that CluedIn uses to define the **uniqueness** of a golden record. During processing, if two clues share the **same identifier**, they are **merged** into a single golden record. This ensures that data from different sources is unified under a consistent, unique identifier. **Example** -Let’s explore the concept of codes (identifiers) in CluedIn through an example. We have a golden record—John Smith—that originates from the HR system. 
One of the codes for this golden record is created using the ID. Now, a new record from the CRM system appears in CluedIn, and one of its codes matches the code of the golden record from the HR system. As a result, the new CRM record is merged with the existing HR record, integrating any new properties from the CRM record into the existing golden record. +Let’s explore the concept of identifiers in CluedIn through an example. We have a golden record—John Smith—that originates from the HR system. One of the identifiers for this golden record is created using the ID. Now, a new record from the CRM system appears in CluedIn, and one of its identifiers matches an identifier of the golden record from the HR system. As a result, the new CRM record is merged with the existing HR record, integrating any new properties from the CRM record into the existing golden record. ![codes-merge-1.gif](../../assets/images/key-terms-and-features/codes-merge-1.gif) -To find all the codes that uniquely represent a golden record in the system, go to the golden record page, and select **View Codes**. +To find all the identifiers that uniquely represent a golden record in the system, go to the golden record page, and select **View Identifiers**. ![codes-1.gif](../../assets/images/key-terms-and-features/codes-1.gif) -The codes are divided into two sections: +The identifiers are divided into two sections: -- [Origin code](#entity-origin-code) – also referred to as the entity origin code. This is the **primary unique identifier** of a golden record in CluedIn. +- [Primary identifier](#primary-identifier) – this is the **primary unique identifier** of a golden record in CluedIn. -- [Codes](#codes) – also referred to as entity codes. This section contains all codes associated with a golden record.
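The merge-by-identifier behavior described above can be sketched as follows. This is a simplified illustration, not CluedIn's processing engine; the identifier format and property names are assumptions:

```python
# Hedged sketch of merge-by-identifier (illustrative only):
# if an incoming record shares any identifier with an existing golden
# record, its properties are merged into that golden record.
golden_records = {}  # identifier -> merged property dict

def ingest(record, identifiers):
    # Find an existing golden record via any shared identifier.
    existing = next(
        (golden_records[i] for i in identifiers if i in golden_records),
        None,
    )
    if existing is None:
        existing = {}  # no match: a new golden record is created
    # Integrate the new (non-empty) properties into the golden record.
    existing.update({k: v for k, v in record.items() if v})
    for i in identifiers:
        golden_records[i] = existing

# HR record arrives first; identifier values are assumed examples.
ingest({"name": "John Smith", "dept": "HR"}, ["/Employee#HR:42"])
# CRM record shares an identifier, so it merges into the same golden record.
ingest({"name": "John Smith", "phone": "555-0100"},
       ["/Employee#CRM:9", "/Employee#HR:42"])

assert golden_records["/Employee#HR:42"]["phone"] == "555-0100"
```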
-For more information, see the **Codes** section in our [Review mapping](/integration/review-mapping#codes) article. +For more information, see the **Identifiers** section in our [Review mapping](/integration/review-mapping#identifiers) article. -## Entity origin code (primary identifier) +## Primary identifier -An entity origin code is a **primary unique identifier** of a record in CluedIn. The required details for producing the entity origin code are established when the mapping for a data set is created. To find these details, go to the **Map** tab of the data set and select **Edit mapping**. On the **Map entity** tab, you'll find the **Entity Origin** section, which contains the required details for producing the entity origin code. +A primary identifier (previously known as entity origin code) is a **primary unique identifier** of a record in CluedIn. The required details for producing the primary identifier are established when the mapping for a data set is created. To find these details, go to the **Map** tab of the data set and select **Edit mapping**. On the **Map entity** tab, you'll find the **Primary Identifier** section, which contains the required details for producing the primary identifier. ![codes-2.png](../../assets/images/key-terms-and-features/codes-2.png) -The entity origin code is made up from the [entity type](/key-terms-and-features/entity-type) (1), the [origin](/key-terms-and-features/origin) (2), and the value of the property that was selected for producing the entity origin code (3). This combination allows achieving absolute uniqueness across any data source that you interact with. +The primary identifier is made up from the [business domain](/key-terms-and-features/entity-type) (1), the [origin](/key-terms-and-features/origin) (2), and the value of the property that was selected for producing the primary identifier (3). This combination allows achieving absolute uniqueness across any data source that you interact with.
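As an illustration, the three parts can be concatenated like this. The `#` and `:` separators and the example values are assumptions chosen for readability, not CluedIn's internal serialization:

```python
# Illustrative composition of a primary identifier (separators assumed):
business_domain = "/Employee"  # (1) example business domain
origin = "HR"                  # (2) example origin (the source system)
value = "12345"                # (3) value of the property selected in mapping

primary_identifier = f"{business_domain}#{origin}:{value}"
print(primary_identifier)  # prints /Employee#HR:12345
```

Because all three parts contribute to the identifier, two records with the same ID value but different origins still produce distinct primary identifiers.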
![codes-3.png](../../assets/images/key-terms-and-features/codes-3.png) -There might be cases when the property for producing the entity origin code is empty or you don't have any property suitable for defining uniqueness. In the following sections, we'll explore different ways for defining uniqueness. +There might be cases when the property for producing the primary identifier is empty or you don't have any property suitable for defining uniqueness. In the following sections, we'll explore different ways for defining uniqueness. ### Empty value in primary identifier @@ -68,13 +68,13 @@ Hash codes are case sensitive. So, with the same properties as in the example ab ### Auto-generated key in primary identifier -If your records do not have a property suitable for defining uniqueness, you can select the auto-generated option for producing the primary identifier in mapping. As a result, CluedIn will generate unique entity origin codes for the records using hash codes as documented above. +If your records do not have a property suitable for defining uniqueness, you can select the auto-generated option for producing the primary identifier in mapping. As a result, CluedIn will generate unique primary identifiers for the records using hash codes as documented above. **When not to use auto-generated keys?** -If you are using hash codes, remember that when the **record is changed**, the **value of the code will change** as well. It means that you should **avoid using auto-generated keys when you edit a data set**. +If you are using hash codes, remember that when the **record is changed**, the **value of the identifier will change** as well. It means that you should **avoid using auto-generated keys when you edit a data set**. -In CluedIn, we offer you the possibility to edit the source data. This is a great option as it can lead to much faster and better results in your golden records. 
However, if you use the **auto-generated key** to produce the entity origin code, each time you change the value, it will generate a different code. +In CluedIn, we offer you the possibility to edit the source data. This is a great option as it can lead to much faster and better results in your golden records. However, if you use the **auto-generated key** to produce the primary identifier, each time you change the value, it will generate a different identifier. For example, let's say you have some rules to capitalize firstName and lastName. Let's assume we have 2 records. @@ -99,7 +99,7 @@ If we add a rule to capitalize firstName and lastName, the records will be chang }] ``` -If you use the auto-generated key to produce the entity origin code, these two records will use the same hash code `e7c4d00573302d3b1432fd14d89e5dd0dc68a0ea`, so they will **merge**. However, **if you have already processed the data, it can lead to duplication**. Let's trace this process step-by-step. +If you use the auto-generated key to produce the primary identifier, these two records will use the same hash code `e7c4d00573302d3b1432fd14d89e5dd0dc68a0ea`, so they will **merge**. However, **if you have already processed the data, it can lead to duplication**. Let's trace this process step-by-step. 1. Upload the following JSON: @@ -114,7 +114,7 @@ If you use the auto-generated key to produce the entity origin code, these two r }] ``` -2. Map the data with **Auto-generated** key to produce the entity origin code. +2. Map the data with **Auto-generated** key to produce the primary identifier. 3. Process the data. @@ -124,7 +124,7 @@ If you use the auto-generated key to produce the entity origin code, these two r 6. Re-process the data. -As a result, you will have 2 golden records because you have changed the origin code of the golden record that had lowercase values. 
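+The effect described above can be demonstrated with a small sketch, assuming (for illustration only) that the auto-generated key is a SHA-1 hash of the concatenated field values; CluedIn's actual hashing scheme is not specified here.

```python
import hashlib

def auto_key(record: dict) -> str:
    # Hypothetical stand-in for an auto-generated key: hash the
    # concatenated field values of the record.
    joined = "|".join(f"{k}={record[k]}" for k in sorted(record))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

original = {"firstName": "john", "lastName": "doe"}
edited = {"firstName": "John", "lastName": "Doe"}  # capitalization rule applied

# The edited record hashes to a different key, so it no longer matches
# the golden record that was produced from the original values.
print(auto_key(original) == auto_key(edited))  # False
```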
+As a result, you will have 2 golden records because you have changed the primary identifier of the golden record that had lowercase values. When you process the records for the first time in step 3, you send 2 different codes: @@ -141,11 +141,11 @@ When you process the records for the second time in step 6, you send the same co The records with codes 1, 3, and 4 will merge together. And the record with code 2 will remain as a separate golden record. {:.important} -If you want to edit your records in the source, do not use auto-generated key for producing entity origin code. +If you want to edit your records in the source, do not use an auto-generated key for producing the primary identifier. ### Compound key (MDM code) in primary identifier -If you do not want to use an auto-generated key to produce the entity origin code, you can use a compound key. A compound key is built by concatenating different attributes to ensure uniqueness. It is commonly referred to as the MMDM code. +If you do not want to use an auto-generated key to produce the primary identifier, you can use a compound key. A compound key is built by concatenating different attributes to ensure uniqueness. It is commonly referred to as the MDM code. For example, an MDM code can combine multiple attributes for a customer. @@ -158,7 +158,7 @@ For example, an MDM code can combine multiple attributes for a customer. - date of birth ``` -If you go with the MDM code, make sure you **normalize the values** by either creating a computed column for your data set or by adding a bit of glue code in using [advanced mapping](/integration/additional-operations-on-records/advanced-mapping-code). Our CluedIn experts can assist you with this task. Normalizing the MDM code is important because it will prevent scenarios where editing values changes the origin entity code, leading to undesired effects.
+If you go with the MDM code, make sure you **normalize the values** by either creating a computed column for your data set or by adding a bit of glue code using [advanced mapping](/integration/additional-operations-on-records/advanced-mapping-code). Our CluedIn experts can assist you with this task. Normalizing the MDM code is important because it will prevent scenarios where editing values changes the primary identifier, leading to undesired effects. ### Lack of options to define uniqueness @@ -170,45 +170,45 @@ If you have no way to define uniqueness for your records, generate a GUID using If this is your case, the only way is to fix the issue on the source level. You can modify the source of data to set up some kind of uniqueness. For example, if you have a SQL table, you can add a unique identifier for each row. -## Entity codes (Identifiers) +## Identifiers -An entity code is an additional identifier that uniquely represents a record in CluedIn. The required details for producing the entity codes are established when the mapping for a data set is created. To find these details, go to the **Map** tab of the data set and select **Edit mapping**. On the **Map entity** tab, you'll find the **Codes** section, which contains the required details for producing the entity codes. +In addition to the primary identifier, identifiers (previously known as entity codes) can uniquely represent a record in CluedIn. The required details for producing the identifiers are established when the mapping for a data set is created. To find these details, go to the **Map** tab of the data set and select **Edit mapping**. On the **Map entity** tab, you'll find the **Identifiers** section, which contains the required details for producing the identifiers.
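+The normalization advice for compound keys (MDM codes) in the section above can be sketched as follows. This is a hypothetical Python illustration using simple trimming and lowercasing, not the actual advanced-mapping glue code.

```python
def mdm_code(*attributes: str) -> str:
    # Normalize each attribute (trim whitespace, lowercase) before
    # concatenating, so cosmetic edits don't change the compound key.
    return "-".join(a.strip().lower() for a in attributes)

# Two differently formatted versions of the same customer yield one key.
print(mdm_code("John", "Doe", " Copenhagen "))  # john-doe-copenhagen
print(mdm_code("john", "DOE", "copenhagen"))    # john-doe-copenhagen
```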
![codes-4.png](../../assets/images/key-terms-and-features/codes-4.png) -If a data set contains additional columns that can serve as unique identifiers besides the ones used for producing the entity origin code, then these columns can also be used to produce entity codes. For example, if the entity origin code is produced using the ID, then the entity code could be produced using the email. The entity codes are made up from the entity type, [origin](/key-terms-and-features/origin), and the value from the column that was selected for producing the entity codes. +If a data set contains additional columns that can serve as unique identifiers besides the ones used for producing the primary identifier, then these columns can also be used to produce additional identifiers. For example, if the primary identifier is produced using the ID, then the additional identifier could be produced using the email. The identifiers are made up of the [business domain](/key-terms-and-features/entity-type), [origin](/key-terms-and-features/origin), and the value from the column that was selected for producing the identifiers. -In the **Entity Codes** section, you can instruct CluedIn to produce additional codes: +In the **Identifiers** section, you can instruct CluedIn to produce additional identifiers: -- **Provider name codes** – codes that are built form the entity type, provider name (for example, File Data Source), and the value from the column that was selected for producing the entity origin code. +- **Provider name identifiers** – identifiers that are built from the business domain, provider name (for example, File Data Source), and the value from the column that was selected for producing the primary identifier. -- **Strict edge codes** – codes that are built from the entity type, data source group ID/data source ID/data set ID, and the value from the column that was selected for producing the entity origin code.
+- **Strict edge identifiers** – identifiers that are built from the business domain, data source group ID/data source ID/data set ID, and the value from the column that was selected for producing the primary identifier. -**What happens if the value of a code is empty?** +**What happens if the value of an identifier is empty?** -The value will be ignored and no code will be added. A code is not a required element and using a hash code would be unnecessary as you have already defined uniqueness with the entity origin code. +The value will be ignored and no identifier will be added. An identifier is not a required element and using a hash code would be unnecessary as you have already defined uniqueness with the primary identifier. -**Is it bad if I have no codes defined?** +**Is it bad if I have no identifiers defined?** -No, it can happen regularly, generally when the source records cannot be trusted or are unknown. When in doubt, it is better not to add extra code and rely on deduplication projects to find duplicates. +No, it can happen regularly, generally when the source records cannot be trusted or are unknown. When in doubt, it is better not to add an extra identifier and rely on deduplication projects to find duplicates. ## FAQ -**How to make sure that the codes will blend across different data sources?** +**How to make sure that the identifiers will blend across different data sources?** -Since a code will only merge with another code if they are identical, how can you merge records across different systems if the origin is different? One of the ways to achieve it is through the GUID. +Since an identifier will only merge with another identifier if they are identical, how can you merge records across different systems if the origin is different? One of the ways to achieve it is through the GUID. If a record has an identifier that is a GUID/UUID, you can set the origin as CluedIn because no matter the system, the identifier should be unique.
However, this is not applicable if you are using deterministic GUIDS. If you're wondering whether you use deterministic GUIDs, conducting preliminary analysis on the data can help. Check if many GUIDs overlap in a certain sequence, such as the first chunk of the GUID being replicated many times. This is a strong indicator that you are using deterministic GUIDs. Random GUIDs are so unique that the chance of them being the same is close to impossible. -You could even determine that the entity type can be generic as well. You will have to craft these special entity codes in your clues (for example, something like `/Generic#CluedIn:`). You will need to make sure your edges support the same mechanism. In doing this, you are instructing CluedIn that no matter the entity type, no matter the origin of the data, this record can be uniquely identified by just the GUID. +You could even determine that the business domain can be generic as well. You will have to craft these special identifiers in your clues (for example, something like `/Generic#CluedIn:`). You will need to make sure your edges support the same mechanism. In doing this, you are instructing CluedIn that no matter the business domain, no matter the origin of the data, this record can be uniquely identified by just the GUID. -**What if a record doesn't have a unique reference to construct a code?** +**What if a record doesn't have a unique reference to construct an identifier?** -Often you will find that you need to merge or link records across systems that don't have IDs but rather require fuzzy merging to be able to link records. In this case, we often suggest creating a composite code constructed from a combination of column or property values that guarantee uniqueness. For example, if you have a Transaction record, you might find that a combination of the Transaction Date, Product, Location, and Store will guarantee uniqueness. 
It is best to calculate a "Hash" of these values combined, which means that we can calculate a code from this. +Often you will find that you need to merge or link records across systems that don't have IDs but rather require fuzzy merging to be able to link records. In this case, we often suggest creating a composite identifier constructed from a combination of column or property values that guarantee uniqueness. For example, if you have a Transaction record, you might find that a combination of the Transaction Date, Product, Location, and Store will guarantee uniqueness. It is best to calculate a "Hash" of these values combined, which means that we can calculate an identifier from this. -**What if an identifier is not ready for producing a code?** +**What if a key is not ready for producing an identifier?** -Sometimes identifiers for codes are not ready to be made into a unique entity origin code. For example, your data might include default or fallback values when a real value is not present. Imagine you have an EmployeeId column, and when a value is missing, placeholders like "NONE", "", or "N/A" are used. These are not valid identifiers for the EmployeeId. However, the important aspect is that you cannot handle all permutations of these placeholders upfront. Therefore, you should create codes with the intention that these values are unique. You can fix and clean up such values later. +Sometimes keys for identifiers are not ready to be made into a unique primary identifier. For example, your data might include default or fallback values when a real value is not present. Imagine you have an EmployeeId column, and when a value is missing, placeholders like "NONE", "", or "N/A" are used. These are not valid identifiers for the EmployeeId. However, the important aspect is that you cannot handle all permutations of these placeholders upfront. Therefore, you should create identifiers with the intention that these values are unique. 
You can fix and clean up such values later. ## Useful resources diff --git a/docs/110-key-terms-and-features/110-vocabularies.md b/docs/110-key-terms-and-features/110-vocabularies.md index 4b6d5c68..393a2cc9 100644 --- a/docs/110-key-terms-and-features/110-vocabularies.md +++ b/docs/110-key-terms-and-features/110-vocabularies.md @@ -60,14 +60,14 @@ Obviously, those 2 keys—`CRM.contact.email` and `ERP.contact.email`—represen By applying this principle, you can keep your lineage and have better flexibility and agility. However, it is up to you to decide when and if you want to map vocabulary keys with the same meaning to a shared vocabulary key or keep them separate. -### Entity type vs. vocabulary +### Business domain vs. vocabulary -When you map your data in CluedIn, you can map it to one entity type and one vocabulary. However, a vocabulary can be shared among different entity types and can represent only a partial aspect of the golden record. So, the following statements are true when describing a golden record: +When you map your data in CluedIn, you can map it to one business domain and one vocabulary. However, a vocabulary can be shared among different business domains and can represent only a partial aspect of the golden record. So, the following statements are true when describing a golden record: -- You can assign only **one entity type** to a golden record. +- You can assign only **one business domain** to a golden record. - You can use **multiple vocabularies** for a golden record. -By distinguishing between entity type and vocabulary, we decouple the value aspect of records from their modeling aspect. This approach provides greater flexibility in modeling and allows for evolution with changing use cases. +By distinguishing between business domain and vocabulary, we decouple the value aspect of records from their modeling aspect. This approach provides greater flexibility in modeling and allows for evolution with changing use cases. 
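+The two statements above (one business domain, multiple vocabularies) can be pictured with a minimal, hypothetical in-memory model; the structure below is illustrative, not CluedIn's storage format.

```python
# One business domain per golden record; properties may come from
# several vocabularies (here, a CRM and an ERP vocabulary).
golden_record = {
    "businessDomain": "Contact",  # exactly one
    "properties": {
        "crm.contact.email": "ada@example.com",
        "erp.contact.email": "ada@example.com",
    },
}

# Derive the set of vocabularies contributing to this golden record.
vocabularies = {key.rsplit(".", 1)[0] for key in golden_record["properties"]}
print(sorted(vocabularies))  # ['crm.contact', 'erp.contact']
```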
## Core vocabularies @@ -80,7 +80,7 @@ The role of core vocabularies is to merge records from different systems based o Although you can add your own core vocabularies, we suggest that you do not. The reason is mainly due to upgrade support and making sure your upgrades are as seamless and automated as possible. -Core vocabularies do not include the namespace for a source system. It will typically have the entity type, the name of the key - or if it is a nested key like and address, it will have the nesting shown in the key name: +Core vocabularies do not include the namespace for a source system. It will typically have the business domain, the name of the key - or if it is a nested key like an address, it will have the nesting shown in the key name: - organization.industry - organization.address.city diff --git a/docs/110-key-terms-and-features/120-origin.md b/docs/110-key-terms-and-features/120-origin.md index 2ab33e77..c9199d12 100644 --- a/docs/110-key-terms-and-features/120-origin.md +++ b/docs/110-key-terms-and-features/120-origin.md @@ -14,25 +14,25 @@ tags: ["development","clues"] Generally, the origin determines the **source of a golden record**. So, when you map your data, the origin will be automatically set to the name of the data source, for example, `MicrosoftDynamics`, `Oracle`, `Hubspot`, or `MsSQLDatabase5651651`. However, you can change the origin during mapping if needed. -As mentioned in our [Entity codes](/key-terms-and-features/entity-codes) reference article, the origin is used in the entity origin code (primary identifier) and the codes (identifiers). +As mentioned in our [Identifiers](/key-terms-and-features/entity-codes) reference article, the origin is used in the primary identifier and the additional identifiers.
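+A sketch of the origin's effect on matching, based on the serial-number scenario discussed in this article (the identifier string format is illustrative only):

```python
def identifier(business_domain: str, origin: str, value: str) -> str:
    # Illustrative /BusinessDomain#origin:value format.
    return f"/{business_domain}#{origin}:{value}"

sn = "SN-0042"

# Default: each source keeps its own origin, so identifiers differ
# even when the serial number is the same, and no merge happens.
print(identifier("Product", "crm", sn) == identifier("Product", "erp", sn))  # False

# Shared origin: both sources produce the identical identifier,
# so the records merge.
print(identifier("Product", "PRODUCT-SERIALNUMBER", sn)
      == identifier("Product", "PRODUCT-SERIALNUMBER", sn))  # True
```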
![entity-origin-code.png](../../assets/images/key-terms-and-features/entity-origin-code.png) In this article, we'll explain the usage of the origin in two important processes in CluedIn: -- [Merging records by codes](#merging-records-by-codes) +- [Merging records by identifiers](#merging-records-by-identifiers) - [Linking golden records](#linking-golden-records) -## Merging records by codes +## Merging records by identifiers -Since the origin is used in the entity origin code (primary identifier) and the codes (identifiers), it plays a role in merging—when 2 codes are identical, the records will merge together. +Since the origin is used in the primary identifier and the additional identifiers, it plays a role in merging—when 2 identifiers are identical, the records will merge together. -To understand the role of origin in merging, suppose you have an attribute that you can _safely rely on to merge records across source systems_. Let's say this attribute is a `SerialNumber` that is used in your CRM, ERP, and Support systems. As the serial number is unique and cross-system, you can use it to merge together _all golden records that have the same serial number_. Of course, you can achieve this using our UI; however, there is a faster way to do this via merging by codes. +To understand the role of origin in merging, suppose you have an attribute that you can _safely rely on to merge records across source systems_. Let's say this attribute is a `SerialNumber` that is used in your CRM, ERP, and Support systems. As the serial number is unique and cross-system, you can use it to merge together _all golden records that have the same serial number_. Of course, you can achieve this using our UI; however, there is a faster way to do this via merging by identifiers. -Let's consider the example of three records, each coming from a different source system—CRM, ERP, and Support. For each data source, we select the `Serial Number` to produce the entity origin code.
The following table shows the codes that will be produced by default. +Let's consider the example of three records, each coming from a different source system—CRM, ERP, and Support. For each data source, we select the `Serial Number` to produce the primary identifier. The following table shows the identifiers that will be produced by default. -| Source | Entity Type | Origin | Entity origin code | +| Source | Business domain | Origin | Primary identifier | |--|--|--|--| | CRM | Product | crm | `/Product#crm:[SERIAL NUMBER VALUE]` | | ERP | Product | erp | `/Product#erp:[SERIAL NUMBER VALUE]` | @@ -42,9 +42,9 @@ Even if the serial number is the same, the records will not merge together as th ![merging-by-codes-1.png](../../assets/images/key-terms-and-features/merging-by-codes-1.png) -So, how would you use the serial number to merge records together? The answer is by producing a **code that shares the same origin**, for example, `PRODUCT-SERIALNUMBER`. As a result, the code for each record will share the same entity type, origin, and the value of serial number as shown in the following table. +So, how would you use the serial number to merge records together? The answer is by producing an **identifier that shares the same origin**, for example, `PRODUCT-SERIALNUMBER`. As a result, the identifier for each record will share the same business domain, origin, and the value of serial number as shown in the following table. -| Source | Entity type | Origin | Entity origin code | +| Source | Business domain | Origin | Primary identifier | |--|--|--|--| | CRM | Product | PRODUCT-SERIALNUMBER | `/Product#PRODUCT-SERIALNUMBER:[SERIAL NUMBER VALUE]` | | ERP | Product | PRODUCT-SERIALNUMBER | `/Product#PRODUCT-SERIALNUMBER:[SERIAL NUMBER VALUE]` | @@ -56,24 +56,24 @@ Since the origin is shared among different sources, each time the same serial nu ## Linking golden records -Origin can be used to link golden records together to **create relationship**. 
You can link golden records using codes, rules, or manually in the UI. To create a relationship using codes, you need to know the **origin of target golden records**. These are the golden records to which you want to link current records. +Origin can be used to link golden records together to **create relationship**. You can link golden records using identifiers, rules, or manually in the UI. To create a relationship using identifiers, you need to know the **origin of target golden records**. These are the golden records to which you want to link current records. Suppose you have Contact records that contain the `companyID` property, and you know that you have Company records with this `ID`. To establish a link between Contact and Company, you need to define the "to" relationship by setting up the following: -- Entity Type: `/Company` +- Business domain: `/Company` - Origin: `[ORIGIN-OF-COMPANY-RECORDS]` - Value: `Company ID` -The **combination of those 3 values** needs to **match one of the codes of target records**. +The **combination of those 3 values** needs to **match one of the identifiers of target records**. ![linking-golden-records.png](../../assets/images/key-terms-and-features/linking-golden-records.png) -To make the process of linking golden records easier, you can use the recommendation for defining the origin that we provided in [Merging by codes](#merging-records-by-codes). Essentially, the method of **shared origin** that you use for merging by codes can also be used to facilitate the process of linking golden records. This way you do not have to rely on the source system and instead use the origin that you defined for related data. +To make the process of linking golden records easier, you can use the recommendation for defining the origin that we provided in [Merging by identifiers](#merging-records-by-identifiers). 
Essentially, the method of **shared origin** that you use for merging by identifiers can also be used to facilitate the process of linking golden records. This way you do not have to rely on the source system and instead use the origin that you defined for related data. ## Useful resources -- [Entity type](/key-terms-and-features/entity-type) +- [Business domain](/key-terms-and-features/entity-type) -- [Entity codes (Identifiers)](/key-terms-and-features/entity-codes) +- [Identifiers](/key-terms-and-features/entity-codes) - [Review mapping](/integration/review-mapping) \ No newline at end of file diff --git a/docs/130-workflow.md/030-create-and-manage-workflows.md b/docs/130-workflow.md/030-create-and-manage-workflows.md index ac9d4f80..0ad5e775 100644 --- a/docs/130-workflow.md/030-create-and-manage-workflows.md +++ b/docs/130-workflow.md/030-create-and-manage-workflows.md @@ -33,13 +33,13 @@ Currently, you can automate the approval process for certain actions in CluedIn. | Editing a property in the record** | Batched Clues Approval | If somebody tries to edit a property in the record, an approval request is sent to the users with the same or higher claim access level. | Rule creation* | Processing Rule Change Approval | If somebody tries to create a new rule, an approval request is sent to the users with the same or higher claim access level. | | Rule modification | Processing Rule Change Approval | If somebody tries to make changes to the rule, an approval request is sent to the owners of the rule. | -| Entity type creation* | Entity Type Change Approval | If somebody tries to create a new entity type, an approval request is sent to the users with the same or higher claim access level. | +| Business domain creation* | Entity Type Change Approval | If somebody tries to create a new business domain, an approval request is sent to the users with the same or higher claim access level. 
| | Receiving internal CluedIn notifications | Notification | If somebody receives an internal notification in CluedIn, the same notification is sent to the external systems such as Outlook or the Approval app in Teams. | | Inviting a new user to CluedIn | User Invite Approval | If somebody tries to add a new user to CluedIn, an approval request is sent to the users with the same or higher claim access level. | _* The approval requests for these actions are sent only if the **Approvals for creating items** option is enabled in [workflow settings](/microsoft-integration/power-automate/configuration-guides)._ -_* *The approval requests for this action are sent only if the [entity type](/management/entity-type) of the records has the **Batch approval workflow** option enabled._ +_* *The approval requests for this action are sent only if the [business domain](/management/entity-type) of the records has the **Batch approval workflow** option enabled._ {:.important} One action can be used only in one workflow. For example, if you created a workflow with the Vocabulary Change Approval action, you can't select this action in another workflow. diff --git a/docs/150-consume/010-graphql.md b/docs/150-consume/010-graphql.md index 4ffb340d..1b510107 100644 --- a/docs/150-consume/010-graphql.md +++ b/docs/150-consume/010-graphql.md @@ -11,7 +11,7 @@ last_modified: 2021-10-08 CluedIn provides GraphQL as its way to pull and query data from it. The CluedIn GraphQL endpoint uses a combination of the different datastores to service the result of the query in question. -You might find that a particular GraphQL query uses the Search, Graph and Blob Datastore to render the results. This is due to the query optimiser of CluedIn that determines the right datastore to serve the different parts of your query. This also allows immense flexibility with querying the data. 
An example would be that if we wanted to find all entities that are of a specific Entity Type and have a particular value for a property then you will find that the Search Store will service both these parts of the query and hence CluedIn will only ask it to service the query. If you then ask it to run this query, but return the full history of the records then CluedIn will run the search against the Search Store, but then using the results from the Search it will then ask the Blob Store to fetch the full object history out if it. Likewise, if you asked it to also return the records that are connected to these results of type Person, then it will most likely ask the Graph Store to fulfil that part of the query. +You might find that a particular GraphQL query uses the Search, Graph and Blob Datastore to render the results. This is due to the query optimiser of CluedIn that determines the right datastore to serve the different parts of your query. This also allows immense flexibility with querying the data. An example would be that if we wanted to find all entities that are of a specific business domain and have a particular value for a property then you will find that the Search Store will service both these parts of the query and hence CluedIn will only ask it to service the query. If you then ask it to run this query, but return the full history of the records then CluedIn will run the search against the Search Store, but then using the results from the Search it will then ask the Blob Store to fetch the full object history out of it. Likewise, if you asked it to also return the records that are connected to these results of type Person, then it will most likely ask the Graph Store to fulfil that part of the query.
![image](../assets/images/consume/simple-graphql-example.png) diff --git a/docs/150-consume/graphql/020-add-graphql-entity-type-resolvers.md b/docs/150-consume/graphql/020-add-graphql-entity-type-resolvers.md index 0b8f9405..c146a1cf 100644 --- a/docs/150-consume/graphql/020-add-graphql-entity-type-resolvers.md +++ b/docs/150-consume/graphql/020-add-graphql-entity-type-resolvers.md @@ -8,7 +8,7 @@ title: Add GraphQL entity type resolvers tags: ["consume","graphql"] --- -You can add your own specific resolvers to fetch data given filters such as what entity type a record is. +You can add your own specific resolvers to fetch data given filters such as what business domain a record is. Here is an example of how to return Calendar Events in a different way through the GraphQL endpoints. diff --git a/docs/150-consume/streams/040-stream-logs.md b/docs/150-consume/streams/040-stream-logs.md new file mode 100644 index 00000000..ba1bfc2b --- /dev/null +++ b/docs/150-consume/streams/040-stream-logs.md @@ -0,0 +1,60 @@ +--- +layout: cluedin +nav_order: 5 +parent: Streams +grand_parent: Consume +permalink: /consume/streams/stream-logs +title: Stream logs +tags: ["consume", "data export", "streams"] +last_modified: 2025-04-18 +--- + +In this article, you will learn how to ensure that your golden records have been successfully exported and how to identify issues when something goes wrong. + +
+ +
+ +CluedIn offers three types of logs to assist in monitoring and troubleshooting your streams: + +- **Golden record stream logs** – view the streams that exported a specific golden record. + +- **Stream logs** – view golden records exported by a specific stream. + +- **Export target health checks** – view the health check status of the export target, updated every minute. + +**Golden record stream logs** + +Each golden record contains the **Streams** tab, where you can find the stream that exported the golden record and the date it was sent to the export target. + +![golden-record-streams.png](../../assets/images/consume/streams/golden-record-streams.png) + +This page lists active and paused streams that exported the golden record. However, if the stream is stopped, it is not displayed on the **Streams** tab because all of its logs are cleared. You can select the stream to view its details. + +**Stream logs** + +Each stream contains the **Stream Log** tab, where you’ll find all golden records that were exported by this stream. If the stream encounters an error while exporting golden records, it will be displayed on the page. This way, you can quickly identify and address any issues. + +![stream-log.png](../../assets/images/consume/streams/stream-log.png) + +You can filter the stream logs by two categories: + +- **Severity** – the level of importance or urgency of an event or issue. Learn about standard severity levels [here](https://learn.microsoft.com/en-us/dotnet/core/extensions/logging?tabs=command-line#log-level). + +- **Area** – the place in the streaming pipeline that produces logs: + + - **Stream ingestion log** – logs coming from the stream. + + - **Health check** – logs coming from the export target assigned to the stream. When the health check status of the export target changes, for example, from **Healthy** to **Unhealthy**, a new log is added to the page. 
+ + - **Export target** – other messages that the export target itself would like to log in the stream. + +This page displays logs only when the stream is active or paused. If you stop the stream, the logs are cleared. + +**Export target health checks** + +Each export target contains the **Health Checks** tab, where you can find the health check status of the export target. + +![export-target-health-check.png](../../assets/images/consume/streams/export-target-health-check.png) + +CluedIn runs health checks for the export target every 60 seconds. If the export target encounters any issues, its health status will be marked as **Unhealthy**. If the export target becomes **Unhealthy**, the stream associated with that export target is stopped and the corresponding log is added to the **Stream Log** page as well. diff --git a/docs/160-release/002-release-2025-01.md b/docs/160-release/002-release-2025-01.md new file mode 100644 index 00000000..7d7999d0 --- /dev/null +++ b/docs/160-release/002-release-2025-01.md @@ -0,0 +1,77 @@ +--- +layout: cluedin +title: Release 2025.01 +parent: Release overview +nav_order: 2 +permalink: /release-notes/2025-01 +--- +## On this page +{: .no_toc .text-delta } +- TOC +{:toc} + +This article outlines new features and improvements in CluedIn 2025.01. + +
+ +
+ +The following sections contain brief descriptions of new features and links to related articles. + +## New Ingestion dashboard + +The new Ingestion dashboard is designed to simplify your work with data sources and make data management more efficient and user-friendly. You’ll find a quick and simple way to start uploading your data into CluedIn from files, ingestion endpoints, databases, manual data entry projects, and crawlers. Additionally, you'll be able to quickly identify which data sets or manual data entry projects require attention, enabling you to prioritize and address issues efficiently. For more information, see [Ingestion](/integration). + +## Sources page update + +The Sources page (previously, Data Sources) now displays the number of sources per type and allows filtering the sources by type. Now, you can quickly find the source you need without expanding the group. + +## Monitoring page update + +The Monitoring page for data sets created from ingestion endpoints now displays the ingestion progress over time in the form of hourly and daily ingestion reports. Additionally, we’ve improved the endpoint index to help you track each payload by receipt ID. This way, you can review all records sent to CluedIn in a specific request. You can also view the golden records produced from each payload and delete those golden records if you no longer need them. We've also added a list of potential errors that can occur with the data set, along with remediation steps and the status for each error. For more information, see [Monitoring for ingestion endpoints](/integration/additional-operations-on-records/monitoring#monitoring-for-ingestion-endpoints). + +## Source record approval + +Source record approval is a mechanism that ensures only verified records are sent for processing. This is particularly useful for data sets created via an ingestion endpoint. After completing the initial full data load, you may want to ingest only delta records on a daily basis.
The approval mechanism helps ensure that only verified delta records are processed. Data source owners can review these delta records and decide whether they should be processed. Source record approval can also be beneficial for manual data entry projects, as it grants project owners full control over the records created. Project owners can review new records added by non-owner users and decide whether they should be processed and turned into golden records. For more information, see [Approval](/integration/additional-operations-on-records/approval). + +## Data set validations + +Data set validations allow you to check source records for errors, inconsistencies, and missing values, and to correct invalid values. You can use auto-validation, where CluedIn analyzes the fields and suggests appropriate validation methods, or configure your own validation methods. By using data set validations, you can enhance the quality of source records and prevent incorrect records from becoming golden records. For more information, see [Validations](/integration/additional-operations-on-records/validations). + +## Power Fx formulas + +You can now use Power Fx formulas in rules to set up filters, conditions, and actions. With the help of Excel-like formulas, you can perform querying, equality testing, decision making, type conversion, and string manipulation based on the supported properties of a data part or a golden record. For more information, see [Power Fx formulas in rules](/management/rules/power-fx-formulas). + +## New access control actions + +Previously, you could only use the Allow Access action in the policy rule to give access to all or specific vocabulary keys in golden records. Now, you have more granular control over the types of access to golden records and their properties: + +- View – to allow view-only access to all or specific vocabulary keys in golden records.
+- Mask – to restrict access to sensitive data, letting certain users or roles know that a value exists while keeping it hidden. +- Add/edit – to grant full control over specific properties in golden records, allowing certain users or roles to add new properties to the golden record or edit existing properties. + +For more information, see [Access control](/management/access-control). + +## Search experience update + +You can quickly find your recent searches, private saved searches, and shared saved searches by clicking anywhere in the search box. We’ve also improved the selection of business domains for search—now, you can view the number of golden records per business domain. Additionally, to simplify the process of adding columns to the search results page, we've updated the vocabulary key selector. Now, the vocabulary keys are grouped by vocabularies, allowing you to conveniently add all or specific vocabulary keys. For more information, see [Search](/key-terms-and-features/search). + +## Audit log actions update + +We’ve expanded the list of audit log actions to help you track changes to [clean projects](/preparation/clean/clean-reference#clean-project-audit-log-actions), [deduplication projects](/management/deduplication/deduplication-reference#deduplication-project-audit-log-actions), [users](/administration/user-management#user-management-reference), and [vocabularies](/management/data-catalog/vocabulary#vocabulary-overview). Additionally, the audit log now displays the activation and deactivation of enrichers, crawlers, and export targets, as well as the addition or removal of permissions for these items. This enhancement provides greater visibility and control, ensuring you can effectively audit and manage all activities. + +## Stream logs + +We have added stream logs to help you verify the successful export of your golden records and identify issues if something goes wrong.
On the golden record page, you can now find a list of streams that exported the golden record, along with the date it was sent to the export target. Additionally, each stream now includes a stream log, where you’ll find all golden records exported by that stream. Stream logs can help you effectively monitor and troubleshoot your streams to ensure everything runs smoothly. For more information, see [Stream logs](/consume/streams/stream-logs). + +## Terminology changes + +To simplify the CluedIn interface and make it more intuitive and better aligned with common industry concepts, we have changed some terms used in the platform. There are three major changes: + +- **Entity type** is now referred to as **Business domain**. + +- **Entity origin code** is now referred to as **Primary identifier**. + +- **Entity codes** are now referred to as **Identifiers**. + +The functionality behind the terms remains the same; only the terms have been changed. Learn more in [Terminology changes](/release-notes/terminology-changes). diff --git a/docs/160-release/003-terminology-changes.md b/docs/160-release/003-terminology-changes.md new file mode 100644 index 00000000..754e3e14 --- /dev/null +++ b/docs/160-release/003-terminology-changes.md @@ -0,0 +1,22 @@ +--- +layout: cluedin +title: Terminology changes +parent: Release overview +nav_order: 3 +permalink: /release-notes/terminology-changes +last_modified: 2025-04-02 +--- + +In the [2025.01 release](/release-notes/2025-01) of CluedIn, we have updated some of the terminology used in the platform. The terminology changes simplify the CluedIn interface, making it more intuitive and better aligned with common industry concepts. +
+ +
+ +The terminology changes affect only the terms; the functionality behind them remains the same. The following table provides a summary of terminology changes in CluedIn. + +| Old term | New term | Definition | +|--|--|--| +| Entity type | Business domain | A well-known business object that provides context for golden records. In CluedIn, all golden records must have a business domain to ensure systematic organization of data management processes. Learn more in [Business domain](/key-terms-and-features/entity-type). | +| Entity origin code | Primary identifier | A mechanism used in CluedIn to define the uniqueness of a golden record. It is configured during the mapping process. The primary identifier consists of a business domain, an origin, and a key that uniquely identifies the record. This combination allows you to achieve absolute uniqueness across all the data sources you interact with. Learn more in [Identifiers](/key-terms-and-features/entity-codes). | +| Entity codes | Identifiers | An additional mechanism to define the uniqueness of a golden record. If a data set contains additional columns that can uniquely represent a record, these columns can also be used to generate identifiers. Learn more in [Identifiers](/key-terms-and-features/entity-codes).
| \ No newline at end of file diff --git a/docs/180-microsoft-integration/010-powerapps.md b/docs/180-microsoft-integration/010-powerapps.md index 7a43fb75..91dc0704 100644 --- a/docs/180-microsoft-integration/010-powerapps.md +++ b/docs/180-microsoft-integration/010-powerapps.md @@ -16,7 +16,7 @@ Power Apps can be integrated with CluedIn to enable you to **manage your master Power Apps integration offers the following benefits: -- 2-way synchronization of Dataverse metadata to CluedIn entity types and vocabularies and vice versa: +- 2-way synchronization of Dataverse metadata to CluedIn business domains and vocabularies and vice versa: - CluedIn stream to export golden records from CluedIn to the Dataverse tables. diff --git a/docs/180-microsoft-integration/050-event-hub.md b/docs/180-microsoft-integration/050-event-hub.md index 4022226b..3fa86a3e 100644 --- a/docs/180-microsoft-integration/050-event-hub.md +++ b/docs/180-microsoft-integration/050-event-hub.md @@ -10,7 +10,7 @@ Azure Event Hub integration enables the transmission of workflow events from Clu - Adding a data source, updating a data source -- Adding an entity type +- Adding a business domain - Adding a role diff --git a/docs/180-microsoft-integration/090-excel-add-in.md b/docs/180-microsoft-integration/090-excel-add-in.md index 041aa89a..fff20d3a 100644 --- a/docs/180-microsoft-integration/090-excel-add-in.md +++ b/docs/180-microsoft-integration/090-excel-add-in.md @@ -94,7 +94,7 @@ When the CluedIn Excel Add-in is added, a new group called **CluedIn** appears o - **Show Taskpane** – opens the CluedIn Excel Add-in task pane to the right side of the window. -- **Create Entity Type** – opens the entity type creation pane in CluedIn in your default browser. For more information, see [Create entity type](/management/entity-type#create-an-entity-type). +- **Create Entity Type** – opens the business domain (previously entity type) creation pane in CluedIn in your default browser. 
For more information, see [Create business domain](/management/entity-type#create-a-business-domain). - **Merge Entity** – initiates the merging process by opening the merging page in CluedIn in your default browser. @@ -116,9 +116,9 @@ Once you connect to an instance of CluedIn, you can start working with the data -1. Select the **Entity Type** of golden records that you want to load. The dropdown list contains the entity types that are currently used in CluedIn. +1. In **Entity Type**, select the business domain (previously entity type) of golden records that you want to load. The dropdown list contains the business domains that are currently used in CluedIn. -1. Specify the **Vocabulary Keys** of golden records that you want to load. To add all vocabulary keys associated with the selected entity type, use the **Auto-select** option. +1. Specify the **Vocabulary Keys** of golden records that you want to load. To add all vocabulary keys associated with the selected business domain, use the **Auto-select** option. ![load-data-auto-select.png](../../assets/images/microsoft-integration/excel-add-in/load-data-auto-select.png) @@ -148,7 +148,7 @@ Once you connect to an instance of CluedIn, you can start working with the data When the data is loaded, it becomes available in the spreadsheet, and you can start [modifying](#modify-loaded-data-in-excel) it as needed. By default, the rows are presented in alternating light blue and white colors. - Note that the sheet name corresponds to the entity type of loaded golden records. If you want to load the data of another entity type, just add a new sheet, edit the configuration on the **Load Data** tab, and load the data. and You can have as many sheets as you like. + Note that the sheet name corresponds to the business domain of loaded golden records.
If you want to load the data of another business domain, just add a new sheet, edit the configuration on the **Load Data** tab, and load the data. You can have as many sheets as you like. ![loaded-data.png](../../assets/images/microsoft-integration/excel-add-in/loaded-data.png) diff --git a/docs/180-microsoft-integration/copilot/020-work-with-copilot.md b/docs/180-microsoft-integration/copilot/020-work-with-copilot.md index 7aabdfc5..78c2afd5 100644 --- a/docs/180-microsoft-integration/copilot/020-work-with-copilot.md +++ b/docs/180-microsoft-integration/copilot/020-work-with-copilot.md @@ -71,7 +71,7 @@ CluedIn Copilot can analyze a data set to provide general overview, suggest poss |--|--|--| | DescribeDataSet | Provides general information about a data set: column description, possible validation checks, data quality issues, and so on.

If you are on the data set page, you can just tell the Copilot to describe _this_ data set. Otherwise, you can refer to the data set by its ID, which you can find in the URL of the page. | Tell me a bit about this data set.

Describe this data set.

Describe the data set with ID 443259BB-1D17-4078-A069-7ECAD418BA19. | | SuggestDatasetMapping | Provides suggestions on how to map a data set to an existing entity type and vocabulary. The suggested mapping can be used to define how the data set columns should be transformed and linked to the specified vocabulary.

If you are on the data set page, you can just tell the Copilot to suggest mapping for _this_ data set. Otherwise, you can refer to the data set by its ID, which you can find in the URL of the page. | Can you suggest a mapping from this data set to the Employee vocabulary?

Can you suggest how to map this data set to the Company vocabulary? | -| CreateDatasetMapping | Create a mapping from a data set to an existing entity type and vocabulary. Note that you'll need to set up the entity origin code to complete the mapping.

If you are on the data set page, you can just tell the Copilot to create mapping for _this_ data set. Otherwise, you can refer to the data set by its ID, which you can find in the URL of the page. | Can you create a mapping from this data set to the Employee vocabulary? | +| CreateDatasetMapping | Creates a mapping from a data set to an existing business domain and vocabulary. Note that you'll need to set up the primary identifier to complete the mapping.

If you are on the data set page, you can just tell the Copilot to create mapping for _this_ data set. Otherwise, you can refer to the data set by its ID, which you can find in the URL of the page. | Can you create a mapping from this data set to the Employee vocabulary? | | ListDataSets | Provides a list of all available data sets. Note that it is not possible to list data sets by creation date or other properties, you can only get a list of all data sets. | Can you list all data sets? | | EntitySearchByDataSetColumnSample| Allows you to check if the values you have chosen as an entity code can already be found in the system. This can be helpful when you want to ensure that the chosen entity codes are unique and do not already exist in the system. | Can you check if values in the customerId column already have values in the system? | @@ -100,7 +100,7 @@ CluedIn Copilot can create vocabularies and vocabulary keys as well detect anoma | Copilot function | Description | Prompt example | |--|--|--| -| CreateVocabulary | Creates a new vocabulary with the specified name. Once a new vocabulary is created, you'll get a link to view and manage the vocabulary details.

If you are on the page of the entity type you want to associate the vocabulary with, you can just tell the Copilot to create the vocabulary for _this_ entity type. If you don't specify the entity type, it will be provided automatically, but you can change it later. | Can you create a new vocabulary called Company for the Company entity type?

Can you create a new vocabulary called Company for this entity type? | +| CreateVocabulary | Creates a new vocabulary with the specified name. Once a new vocabulary is created, you'll get a link to view and manage the vocabulary details.

If you are on the page of the business domain you want to associate the vocabulary with, you can just tell the Copilot to create the vocabulary for _this_ business domain. If you don't specify the business domain, it will be provided automatically, but you can change it later. | Can you create a new vocabulary called Company for the Company business domain?

Can you create a new vocabulary called Company for this business domain? | | CreateVocabularyKey | Creates a new vocabulary key with the specified name. If you previously created a vocabulary in the chat, the new vocabulary keys will be added to that vocabulary.

You can create individual vocabulary keys one at a time by entering a separate prompt for each vocabulary key. However, if you need to create multiple vocabulary keys, you can instruct Copilot to perform the task in a single prompt. | Can you create 10 vocabulary keys including Name, Age, Gender, JobTitle, ContactNumber, Email, ManagedBy, Salary, Tenure and NickName? | | ProfileVocabularyKey | Creates profiling for a vocabulary key. | Can you profile this vocabulary key? | | StandardizeData | Provides suggestions on how to standardize or normalize values within a vocabulary key. You can review the suggestions and then instruct Copilot to create rules or do it on your own. | Can you standardize values of this vocabulary key? | @@ -140,7 +140,7 @@ CluedIn Copilot can create clean projects according to your requirements and dis | Copilot function | Description | Prompt example | |--|--|--| -| CreateCleanProject | Creates a clean project. You'll get a brief project description, including top 10 records that match the project's filters. You can click the link to go to the clean project and validate if Copilot did the right thing. Then you'll be able to generate the project results on your own. | Can you create a clean project to fix contact.jobTitle values in records of the Contact entity type? | +| CreateCleanProject | Creates a clean project. You'll get a brief project description, including top 10 records that match the project's filters. You can click the link to go to the clean project and validate if Copilot did the right thing. Then you'll be able to generate the project results on your own. | Can you create a clean project to fix contact.jobTitle values in records of the Contact business domain? | | ListCleaningProjects | Provides a list of all available clean projects, including project ID and project name. | Can you list all clean projects?

What clean projects are currently available in the platform? | ### Hierarchy skills @@ -149,7 +149,7 @@ CluedIn Copilot can create hierarchies to visualize relations between golden rec | Copilot function | Description | Prompt example | |--|--|--| -| CreateHierarchy | Creates a new hierarchy. Before creating the hierarchy, make sure that the records have the appropriate edge type defined. Once the hierarchy is created, you'll get a link to view the details. | Can you create a hierarchy called Org Chart for all records of the Contact entity type? | +| CreateHierarchy | Creates a new hierarchy. Before creating the hierarchy, make sure that the records have the appropriate edge type defined. Once the hierarchy is created, you'll get a link to view the details. | Can you create a hierarchy called Org Chart for all records of the Contact business domain? | | ListHierarchies | Provides a list of all available hierarchies, including status, number of nodes, creation and modification dates. | Can you list all hierarchies that are available in the system? | ### Glossary skills @@ -163,7 +163,7 @@ CluedIn Copilot can create glossary terms within specific category and display i ### Other skills -CluedIn Copilot can search for golden records according to your requirements as well as perform actions related to entity types (create, describe, list). +CluedIn Copilot can search for golden records according to your requirements as well as perform actions related to business domains (create, describe, list). ![search.gif](../../assets/images/microsoft-integration/copilot/search.gif) @@ -171,6 +171,6 @@ CluedIn Copilot can search for golden records according to your requirements as
CluedIn will return the top 10 results in a table and a link to launch into the full search query as well. | Can you find the Person records where the user.country is in the Nordics? | -| ListEntityTypes | Provides a list of all available entity types. Note that entity types that have no associated data will not appear in the list. | Can you list all entity types? | -| CreateEntityType| Creates a new entity type with the specified name. Once a new entity type is created, you'll get a link to view details. | Can you create a new entity type named Company? | -| DescribeEntity | Provides general information about a golden record: entity type, name, codes, properties, vocabularies, and so on.
If you are on the golden record page, you can just tell the Copilot to describe _this_ golden record. | Can you describe this golden record? | +| ListEntityTypes | Provides a list of all available business domains. Note that business domains that have no associated data will not appear in the list. | Can you list all business domains? | +| CreateEntityType | Creates a new business domain with the specified name. Once a new business domain is created, you'll get a link to view details. | Can you create a new business domain named Company? | +| DescribeEntity | Provides general information about a golden record: business domain, name, codes, properties, vocabularies, and so on.
If you are on the golden record page, you can just tell the Copilot to describe _this_ golden record. | Can you describe this golden record? | diff --git a/docs/180-microsoft-integration/fabric/020-connect-fabric-to-cluedin.md b/docs/180-microsoft-integration/fabric/020-connect-fabric-to-cluedin.md index 619cb65a..97db6b98 100644 --- a/docs/180-microsoft-integration/fabric/020-connect-fabric-to-cluedin.md +++ b/docs/180-microsoft-integration/fabric/020-connect-fabric-to-cluedin.md @@ -159,7 +159,7 @@ After you set up Microsoft Fabric, send the data to CluedIn. This process involv 1. In CluedIn, create auto-mapping for the data set following the instructions [here](/integration/create-mapping). -1. In CluedIn, edit the mapping for the data set to select the property used as the entity name and the property used for the entity origin code. For more information about mapping details, see [Review mapping](/integration/review-mapping). +1. In CluedIn, edit the mapping for the data set to select the property used as the entity name and the property used for the primary identifier. For more information about mapping details, see [Review mapping](/integration/review-mapping). 1. In CluedIn, go to the **Process** tab of the data set, turn on the **Auto submission** toggle, and then select **Switch to Bridge Mode**. diff --git a/docs/180-microsoft-integration/power-automate/020-power-automate-configuration-guide.md b/docs/180-microsoft-integration/power-automate/020-power-automate-configuration-guide.md index ff0ea66d..451abba1 100644 --- a/docs/180-microsoft-integration/power-automate/020-power-automate-configuration-guide.md +++ b/docs/180-microsoft-integration/power-automate/020-power-automate-configuration-guide.md @@ -45,7 +45,7 @@ Make sure that you have completed all of the actions described in [Power Automa - **Enterprise Flow Cache Duration** – a time period for which data is stored in the cache for enterprise flows.
This duration can impact the performance and efficiency of your workflows. In the context of Power Automate, the cache duration helps manage the flow’s performance by temporarily storing data to reduce the need for repeated data retrievals. - - **Entity Type Cache Duration** – a time period for which the records that belong to the entity types with the **Batch approval workflow** option enabled are stored in the cache. + - **Business Domain Cache Duration** – a time period for which the records that belong to the business domains with the **Batch approval workflow** option enabled are stored in the cache. ![workflows-cluedin-configuration.png](../../assets/images/microsoft-integration/power-automate/workflows-cluedin-configuration.png) diff --git a/docs/180-microsoft-integration/powerapps/020-features/010-sync-entitytypes-to-dataverse.md b/docs/180-microsoft-integration/powerapps/020-features/010-sync-entitytypes-to-dataverse.md index 54f538c4..7ad1dd5c 100644 --- a/docs/180-microsoft-integration/powerapps/020-features/010-sync-entitytypes-to-dataverse.md +++ b/docs/180-microsoft-integration/powerapps/020-features/010-sync-entitytypes-to-dataverse.md @@ -4,22 +4,22 @@ nav_order: 10 parent: Features grand_parent: Power Apps Integration permalink: /microsoft-integration/powerapps/features/sync-entitytypes -title: Sync entity types to Dataverse tables +title: Sync business domains to Dataverse tables tags: ["integration", "microsoft", "powerapps", "dataverse"] last_modified: 2023-05-17 --- -This feature allows you to sync CluedIn entity types, vocabularies, and vocabulary keys with Dataverse table and columns. +This feature allows you to sync CluedIn business domains, vocabularies, and vocabulary keys with Dataverse table and columns. -**To sync CluedIn entity types with Dataverse table** +**To sync CluedIn business domains with Dataverse table** 1. On the navigation pane, go to **Administration** > **Settings**, and then find the **PowerApps** section. -1. 
In **Sync CluedIn Entity Types to Dataverse Table**, turn on the toggle, and then enter the entity type that you want to sync. If you want to sync multiple entity types, separate them with a comma (for example, _/_Type1,/Type2,/Type3_). +1. In **Sync CluedIn Business Domains to Dataverse Table**, turn on the toggle, and then enter the business domain that you want to sync. If you want to sync multiple business domains, separate them with a comma (for example, _/Type1,/Type2,/Type3_). ![Sync Entity Types to Dataverse Tables](../images/sync-cluedin-entitytypes-setting.png) - Another way to enable this feature is to navigate to **Management** > **Entity Types** and select the entity type you want to sync. Then, select **Edit** and turn on the toggle for **Sync CluedIn Entity Types to Dataverse Table**. Finally, save changes. + Another way to enable this feature is to navigate to **Management** > **Business Domains** and select the business domain you want to sync. Then, select **Edit** and turn on the toggle for **Sync CluedIn Business Domains to Dataverse Table**. Finally, save changes.
![Sync Entity Types to Dataverse Tables](../images/sync-cluedin-entitytypes-page-setting.png) diff --git a/docs/180-microsoft-integration/powerapps/020-features/020-sync-dataverse-to-cluedin.md b/docs/180-microsoft-integration/powerapps/020-features/020-sync-dataverse-to-cluedin.md index 559f3158..f0ee5c13 100644 --- a/docs/180-microsoft-integration/powerapps/020-features/020-sync-dataverse-to-cluedin.md +++ b/docs/180-microsoft-integration/powerapps/020-features/020-sync-dataverse-to-cluedin.md @@ -4,12 +4,12 @@ nav_order: 20 parent: Features grand_parent: Power Apps Integration permalink: /microsoft-integration/powerapps/features/sync-dataverse -title: Sync Dataverse table to Cluedin entity types/vocabularies +title: Sync Dataverse table to CluedIn business domains/vocabularies tags: ["integration", "microsoft", "powerapps", "dataverse"] last_modified: 2023-05-17 --- -This feature allows you to sync Dataverse table and columns into CluedIn entity type, vocabulary, and vocabulary keys. +This feature allows you to sync a Dataverse table and its columns into a CluedIn business domain, vocabulary, and vocabulary keys. **Prerequisites** @@ -23,18 +23,18 @@ You'll need to provide the logical name of the Dataverse table. There are the fo ![Identifying Logical Name](../images/dataverse-logical-name.png) -**To sync Dataverse table and columns into CluedIn entity types and vocabulary** +**To sync Dataverse table and columns into CluedIn business domains and vocabulary** 1. On the navigation pane, go to **Administration** > **Settings**, and then find the **PowerApps** section. -1. In **Sync Dataverse Table/Columns to CluedIn Entity Types and Vocabulary**, turn on the toggle, and then enter the Dataverse table name. The value should be the **logical name** of the table. If you want to sync multiple tables, separate them with a comma (for example, _logical_name1_,logical_name2,logical_name3_). +1. 
In **Sync Dataverse Table/Columns to CluedIn Business Domains and Vocabulary**, turn on the toggle, and then enter the Dataverse table name. The value should be the **logical name** of the table. If you want to sync multiple tables, separate them with a comma (for example, _logical_name1,logical_name2,logical_name3_). - ![Sync Dataverse Table to Cluedin Entity Types/Vocabularies](../images/sync-dataverse-table-setting.png) + ![sync-dataverse-table.png](../../assets/images/microsoft-integration/power-apps/sync-dataverse-table.png) Once the synchronization has been successfully completed, you'll receive three notifications: **Entity Type Created**, **Vocabulary Created**, and **Vocabulary Keys Created**. ![Sync Dataverse Table Notification](../images/sync-dataverse-table-notification.png) -1. Verify the entity type, vocabulary, and vocabulary keys created in CluedIn. +1. Verify the business domain, vocabulary, and vocabulary keys created in CluedIn. ![Create New EntityType and Vocab](../images/created-new-entitytype-and-vocab.png) \ No newline at end of file diff --git a/docs/180-microsoft-integration/powerapps/020-features/030-create-ingestion-endpoint-workflow.md b/docs/180-microsoft-integration/powerapps/020-features/030-create-ingestion-endpoint-workflow.md index ba0cfe12..87a7a1fb 100644 --- a/docs/180-microsoft-integration/powerapps/020-features/030-create-ingestion-endpoint-workflow.md +++ b/docs/180-microsoft-integration/powerapps/020-features/030-create-ingestion-endpoint-workflow.md @@ -31,7 +31,7 @@ As part of workflow automation, the ingestion endpoint will be created as well. **Workflow** -The creation of workflow will depend on the values of **Sync Entity Types** and **Sync Dataverse Tables**. Once the execution of the job is done, from the sample values above, you can expect two workflows to be created, one for each of the **cluedin_dog** and **crc12_customer** tables. 
+The creation of the workflow will depend on the values of **Sync Business Domains** and **Sync Dataverse Tables**. Once the execution of the job is done, from the sample values above, you can expect two workflows to be created, one for each of the **cluedin_dog** and **crc12_customer** tables. ![Power Automate Workflows](../images/power-automate-workflows.png) @@ -49,7 +49,7 @@ As we already know the structure of the table/vocabulary that we are working on, ![Auto Mapping](../images/ingestion-endpoint-automapping-01.png) -On the **Map** tab, you can find the the full view of all columns mapped to our vocabulary, including edges (relationships) and origin entity code (keys), if there are any. +On the **Map** tab, you can find the full view of all columns mapped to our vocabulary, including edges (relationships) and identifiers, if there are any. ![Auto Mapping](../images/ingestion-endpoint-automapping-02.png) diff --git a/docs/180-microsoft-integration/powerapps/020-features/040-create-batch-approval-worrkflow.md b/docs/180-microsoft-integration/powerapps/020-features/040-create-batch-approval-worrkflow.md index 5955fea4..18a5815c 100644 --- a/docs/180-microsoft-integration/powerapps/020-features/040-create-batch-approval-worrkflow.md +++ b/docs/180-microsoft-integration/powerapps/020-features/040-create-batch-approval-worrkflow.md @@ -9,7 +9,7 @@ tags: ["integration", "microsoft", "powerautomate", "approval", "workflow"] last_modified: 2023-05-17 --- -This feature enables you to automate the creation of the workflow for the batch approval process. If you process the data (regardless of the source) and the system identifies that the entity type used has been tagged for the approval process, the data will be halted, and the approval process will start and wait for the user's approval to continue the data processing. +This feature enables you to automate the creation of the workflow for the batch approval process. 
If you process the data (regardless of the source) and the system identifies that the business domain used has been tagged for the approval process, the data will be halted, and the approval process will start and wait for the user's approval to continue the data processing. **Prerequisites** @@ -20,7 +20,7 @@ For more information, refer to this [link](/microsoft-integration/powerapps/setu **To enable the batch approval workflow** -1. In CluedIn, on the navigation pane, go to **Management** > **Entity Types**, and then select the entity type that you want to sync. +1. In CluedIn, on the navigation pane, go to **Management** > **Business Domains**, and then select the business domain that you want to sync. 1. Select **Edit** and then turn on the toggle for **Enable for Batch Approval Workflow**. diff --git a/docs/180-microsoft-integration/powerapps/020-features/050-create-streams.md b/docs/180-microsoft-integration/powerapps/020-features/050-create-streams.md index 6475fe9b..a298e1a0 100644 --- a/docs/180-microsoft-integration/powerapps/020-features/050-create-streams.md +++ b/docs/180-microsoft-integration/powerapps/020-features/050-create-streams.md @@ -27,13 +27,13 @@ Export target will be created automatically using the same credentials from Orga **Streams** -The creation of a stream will depend on the values of **Sync Entity Types** and **Sync Dataverse Tables**. +The creation of a stream will depend on the values of **Sync Business Domains** and **Sync Dataverse Tables**. Once the execution of the job is done, from the sample values above, two streams should have been created, one for each of the **cluedin_dog** and **crc12_customer** tables. ![CluedIn Streams](../images/cluedin-stream.png) -Each stream will have a certain configuration filtered by entity type. +Each stream will have a certain configuration filtered by business domain. 
![CluedIn Stream Configuration](../images/cluedin-stream-configuration.png) diff --git a/docs/180-microsoft-integration/powerapps/020-power-apps-configuration-guide.md b/docs/180-microsoft-integration/powerapps/020-power-apps-configuration-guide.md index 1c27e388..da21a09c 100644 --- a/docs/180-microsoft-integration/powerapps/020-power-apps-configuration-guide.md +++ b/docs/180-microsoft-integration/powerapps/020-power-apps-configuration-guide.md @@ -49,27 +49,27 @@ Power Apps integration offers a variety of features for syncing data between Clu **To configure specific features of Power Apps integration** -1. If you want to sync multiple CluedIn entity types to Dataverse tables, specify the **Parallel Execution Count**. This is the number of entity types that can be simultaneously synced with Dataverse. Be default, this number is 5. +1. If you want to sync multiple CluedIn business domains to Dataverse tables, specify the **Parallel Execution Count**. This is the number of business domains that can be simultaneously synced with Dataverse. By default, this number is 5. -1. If you want to sync CluedIn entity types to Dataverse tables: +1. If you want to sync CluedIn business domains to Dataverse tables: - 1. Turn on the toggle next to **Sync CluedIn Entity Types Dataverse Table**. + 1. Turn on the toggle next to **Sync CluedIn Business Domains to Dataverse Table**. - 1. Enter the entity types that you want to sync. If you want to sync multiple entity types, separate them with a comma (for example, _/Type1,/Type2,/Type3_). + 1. Enter the business domains that you want to sync. If you want to sync multiple business domains, separate them with a comma (for example, _/Type1,/Type2,/Type3_). ![sync-entity-types.png](../../assets/images/microsoft-integration/power-apps/sync-entity-types.png) - Each entity type will be synced into a separate Dataverse table.
For more information, see [Sync entity types to Dataverse tables](/microsoft-integration/powerapps/features/sync-entitytypes). + Each business domain will be synced into a separate Dataverse table. For more information, see [Sync business domains to Dataverse tables](/microsoft-integration/powerapps/features/sync-entitytypes). -1. If you want to sync Dataverse tables and columns to CluedIn entity types and vocabulary keys: +1. If you want to sync Dataverse tables and columns to CluedIn business domains and vocabulary keys: - 1. Turn on the toggle next to **Sync Dataverse Table/Columns to CluedIn Entity Types and Vocabulary**. + 1. Turn on the toggle next to **Sync Dataverse Table/Columns to CluedIn Business Domains and Vocabulary**. 1. Enter the name of the Dataverse table that you want to sync. This should be the logical name of the Dataverse table. If you want to sync multiple tables, separate them with a comma (for example, _logical_name1,logical_name2,logical_name3_). ![sync-dataverse-table.png](../../assets/images/microsoft-integration/power-apps/sync-dataverse-table.png) - Each table will be synced into a separate CluedIn entity type and vocabulary associated with that entity type. The columns from the table will be synced into the vocabulary keys of the vocabulary associated with the entity type. For more information, see [Sync Dataverse tables to CluedIn entity types and vocabularies](/microsoft-integration/powerapps/features/sync-dataverse). + Each table will be synced into a separate CluedIn business domain and vocabulary associated with that business domain. The columns from the table will be synced into the vocabulary keys of the vocabulary associated with the business domain. For more information, see [Sync Dataverse tables to CluedIn business domains and vocabularies](/microsoft-integration/powerapps/features/sync-dataverse). 1. 
If you want to automate the ingestion of data from Dataverse to CluedIn: @@ -87,7 +87,7 @@ Power Apps integration offers a variety of features for syncing data between Clu 1. Make sure you have the [Dataverse export target](/consume/export-targets/dataverse-connector) installed in your CluedIn instance. It should be available in the list of export targets (**Consume** > **Export Targets** > **Add Export Target**). You do not need to configure the Dataverse export target because it will be configured automatically. - 1. Make sure you have enabled **Sync CluedIn Entity Types to Dataverse Table** for entity types of golden records that you want to export to a Dataverse table. + 1. Make sure you have enabled **Sync CluedIn Business Domains to Dataverse Table** for business domains of golden records that you want to export to a Dataverse table. 1. Turn on the toggle next to **Create CluedIn Stream**. diff --git a/docs/180-microsoft-integration/powerapps/040-external-features.md b/docs/180-microsoft-integration/powerapps/040-external-features.md index e8463f7c..39f43aa7 100644 --- a/docs/180-microsoft-integration/powerapps/040-external-features.md +++ b/docs/180-microsoft-integration/powerapps/040-external-features.md @@ -13,7 +13,7 @@ last_modified: 2023-05-17 1. TOC {:toc} -This will enable the user to see the Power Apps and Power Automate features as an iFrame in CluedIn UI. The feature below will only appear on the entity type's main page if that entity type is part of the synchronization. +This will enable the user to see the Power Apps and Power Automate features as an iFrame in CluedIn UI. The feature below will only appear on the business domain's main page if that business domain is part of the synchronization. 
### Dataverse data diff --git a/docs/180-microsoft-integration/powerapps/images/sync-cluedin-entitytypes-setting.png b/docs/180-microsoft-integration/powerapps/images/sync-cluedin-entitytypes-setting.png index b864b081..779197e1 100644 Binary files a/docs/180-microsoft-integration/powerapps/images/sync-cluedin-entitytypes-setting.png and b/docs/180-microsoft-integration/powerapps/images/sync-cluedin-entitytypes-setting.png differ diff --git a/docs/200-kb/how-to/010-fix-address-data.md b/docs/200-kb/how-to/010-fix-address-data.md index 6627c88c..27864f3c 100644 --- a/docs/200-kb/how-to/010-fix-address-data.md +++ b/docs/200-kb/how-to/010-fix-address-data.md @@ -45,11 +45,11 @@ Let's consider an example of a CluedIn golden record that contains the address p ![libpostal-address-input.png](../../assets/images/kb/how-to/libpostal-address-input.png) -To configure the Libpostal enricher, specify the entity type of golden records that you want to enrich and the vocabulary key that contains the initial address. +To configure the Libpostal enricher, specify the business domain of golden records that you want to enrich and the vocabulary key that contains the initial address. ![libpostal-settings.png](../../assets/images/kb/how-to/libpostal-settings.png) -When the enricher is configured, [trigger](/preparation/enricher/add-enricher#trigger-enrichment) the enrichment for golden records. You can trigger the enrichment for all golden records of the specific entity type using the [GraphQL tool](/consume/graphql/graphql-actions). Alternatively, you can trigger the enrichment manually for each golden record. To do this, on the golden record page, select **More** > **Trigger external enrichment**. +When the enricher is configured, [trigger](/preparation/enricher/add-enricher#trigger-enrichment) the enrichment for golden records. You can trigger the enrichment for all golden records of the specific business domain using the [GraphQL tool](/consume/graphql/graphql-actions). 
Alternatively, you can trigger the enrichment manually for each golden record. To do this, on the golden record page, select **More** > **Trigger external enrichment**. ![trigger-external-enrichment.png](../../assets/images/kb/how-to/trigger-external-enrichment.png) @@ -84,7 +84,7 @@ To configure the Google Maps enricher, specify the following details: - **API Key** for retrieving information from the Google Maps Platform. -- **Accepted Entity Type** to define which golden records will be enriched. +- **Accepted Business Domain** (previously entity type) to define which golden records will be enriched. - **Vocabulary Key used to control whether it should be enriched** to indicate if the golden record should be enriched. If the value of the vocabulary key is _true_, then the golden record will be enriched. Otherwise, the golden record will not be enriched. @@ -94,7 +94,7 @@ To configure the Google Maps enricher, specify the following details: ![google-maps-settings.png](../../assets/images/kb/how-to/google-maps-settings.png) -When the enricher is configured, [trigger](/preparation/enricher/add-enricher#trigger-enrichment) the enrichment for golden records. You can trigger the enrichment for all golden records of the specific entity type using the [GraphQL tool](/consume/graphql/graphql-actions), or you can trigger the enrichment manually for each golden record. Alternatively, you can trigger the enrichment manually for each golden record. To do this, on the golden record page, select **More** > **Trigger external enrichment**. +When the enricher is configured, [trigger](/preparation/enricher/add-enricher#trigger-enrichment) the enrichment for golden records. You can trigger the enrichment for all golden records of the specific business domain using the [GraphQL tool](/consume/graphql/graphql-actions), or you can trigger the enrichment manually for each golden record.
To do this, on the golden record page, select **More** > **Trigger external enrichment**. As a result, the CluedIn golden record now contains a variety of company details from Google Maps, such as administrative area level 1 and 2, business status, country code, and more. diff --git a/docs/200-kb/kb0010-faq.md b/docs/200-kb/kb0010-faq.md index 48e3c134..cd6503a8 100644 --- a/docs/200-kb/kb0010-faq.md +++ b/docs/200-kb/kb0010-faq.md @@ -70,7 +70,7 @@ The good news is that the native CluedIn administration screen is built on top o ## I have made a change in CluedIn that requires me to re-process all the data. What do I do? -It will happen many times with your CluedIn account! It is highly typical and expected. CluedIn has been designed with the idea in mind that we will re-process the data all the time. Once you have made your change, you have quite a lot of control over the level of re-processing. You can re-process at a source level, globally, record level, or even something a little more custom, e.g., entity type level. +It will happen many times with your CluedIn account! It is highly typical and expected. CluedIn has been designed with the idea in mind that we will re-process the data all the time. Once you have made your change, you have quite a lot of control over the level of re-processing. You can re-process at a source level, globally, record level, or even something a little more custom, e.g., business domain level. ## Does CluedIn index everything? 
diff --git a/docs/210-playbooks/008-data-export-playbook.md b/docs/210-playbooks/008-data-export-playbook.md index 82ed13f0..da19a671 100644 --- a/docs/210-playbooks/008-data-export-playbook.md +++ b/docs/210-playbooks/008-data-export-playbook.md @@ -28,7 +28,7 @@ title: Data export playbook Now that you have completed the [data transformation](/playbooks/data-transformation-playbook) process and reached the desired quality of your [golden records](/key-terms-and-features/golden-records), it is time to **push golden records to the target systems** in your organization. {:.important} -To establish correlations between exported golden records in your target systems, use the [entity origin code (primary identifier)](/key-terms-and-features/entity-codes#entity-origin-code-primary-identifier) and [codes (identifiers)](/key-terms-and-features/entity-codes#entity-codes-identifiers) instead of the entity ID. This is because the entity ID does not guarantee uniqueness, as records with different IDs could be merged. +To establish correlations between exported golden records in your target systems, use the [primary identifier](/key-terms-and-features/entity-codes#primary-identifier) and [additional identifiers](/key-terms-and-features/entity-codes#identifiers) instead of the entity ID. This is because the entity ID does not guarantee uniqueness, as records with different IDs could be merged. 
## Data export models diff --git a/docs/210-playbooks/data-engineering/002-extending-cluedin-with-ms-integrations.md b/docs/210-playbooks/data-engineering/002-extending-cluedin-with-ms-integrations.md index 5845bd19..b7d2f22f 100644 --- a/docs/210-playbooks/data-engineering/002-extending-cluedin-with-ms-integrations.md +++ b/docs/210-playbooks/data-engineering/002-extending-cluedin-with-ms-integrations.md @@ -59,7 +59,7 @@ If you use Fabric, Databricks, Snowflake, Synapse, or any other Python-based dat ## Event Hubs integration -[Azure Event Hubs](https://learn.microsoft.com/en-us/azure/event-hubs/) is a native data-streaming service in the cloud that can stream millions of events per second, with low latency, from any source to any destination. By integrating Azure Event Hubs and CluedIn, you can enable the transmission of important events from CluedIn to your event hub. For example, adding or updating a data source, adding an entity type, creating a deduplication project, and more. You can then connect your event hub to Fabric and **build reports or alerts for different events**. This way you can get valuable insights about the activities in CluedIn, such as the number of deduplication or clean projects that were created last week. For more information, see [Event Hub integration](/microsoft-integration/event-hub-integration). +[Azure Event Hubs](https://learn.microsoft.com/en-us/azure/event-hubs/) is a native data-streaming service in the cloud that can stream millions of events per second, with low latency, from any source to any destination. By integrating Azure Event Hubs and CluedIn, you can enable the transmission of important events from CluedIn to your event hub. For example, adding or updating a data source, adding a business domain, creating a deduplication project, and more. You can then connect your event hub to Fabric and **build reports or alerts for different events**. 
This way you can get valuable insights about the activities in CluedIn, such as the number of deduplication or clean projects that were created last week. For more information, see [Event Hub integration](/microsoft-integration/event-hub-integration). In addition to this, CluedIn offers the [Azure Event Hub connector](/consume/export-targets/azure-event-hub-connector). You can configure it and use it in a stream to **export golden records** to a specific event hub. diff --git a/docs/210-playbooks/data-engineering/020-search.md b/docs/210-playbooks/data-engineering/020-search.md index 2234aa56..c86d1059 100644 --- a/docs/210-playbooks/data-engineering/020-search.md +++ b/docs/210-playbooks/data-engineering/020-search.md @@ -45,7 +45,7 @@ To find them, I can run a query like this: ``` The query will return me the top 20 of the `/Duck` entities. -The `query` parameter tells the API to filter the response by a given Entity Type. +The `query` parameter tells the API to filter the response by a given business domain (previously entity type). You can also specify the entity properties you want to get in the payload: `id`, `name`, and `entityType`. ### GraphQL search query with variables and cursor diff --git a/docs/210-playbooks/data-engineering/040-ingestion.md b/docs/210-playbooks/data-engineering/040-ingestion.md index 0dc94dd8..f8bd431e 100644 --- a/docs/210-playbooks/data-engineering/040-ingestion.md +++ b/docs/210-playbooks/data-engineering/040-ingestion.md @@ -28,7 +28,7 @@ We want to load this data to CluedIn. To do that, we need to create an API token CluedIn API token -Next, create an endpoint in CluedIn. From CluedIn's main page, click "Import From Ingestion Endpoint" and create a new endpoint. You will need to enter the endpoint's name, group name, and select entity type: +Next, create an endpoint in CluedIn. From CluedIn's main page, click "Import From Ingestion Endpoint" and create a new endpoint. 
You will need to enter the endpoint's name, group name, and select a business domain (previously entity type): Ingestion Endpoint @@ -163,7 +163,7 @@ Then, in the Map tab, we create an automatic mapping: 1. Click "Add Mapping". 2. Select "Auto Mapping" and "Next". -3. Ensure the entity type is selected or type a new entity type name and click "Create". +3. Ensure the business domain (previously entity type) is selected or type a new business domain name and click "Create". 4. Type the new vocabulary name, like `imdb.title` and click "Create". 5. Click "Create Mapping". diff --git a/docs/210-playbooks/data-ingestion/004-concept-of-mapping.md b/docs/210-playbooks/data-ingestion/004-concept-of-mapping.md index 9b6e4e76..d1482e89 100644 --- a/docs/210-playbooks/data-ingestion/004-concept-of-mapping.md +++ b/docs/210-playbooks/data-ingestion/004-concept-of-mapping.md @@ -39,10 +39,10 @@ The mapping uses multiple CluedIn terms that you will need to learn. While it ma | Name | Purpose | Link to documentation | |--|--|--| -| Entity type | The entity type represents a specific business domain of your data. Read the documentation to understand how to choose a good entity type. | [Link](/key-terms-and-features/entity-type) | +| Business domain | The business domain represents a specific business object that describes the semantic meaning of your data. Read the documentation to understand how to choose a good business domain. | [Link](/key-terms-and-features/entity-type) | | Vocabulary and vocabulary keys | The vocabulary is used to define the semantic layer (metadata) for your data. The vocabulary contains vocabulary keys that describe the properties coming in from the data source. Read the documentation to learn about vocabulary usage. | [Link](/key-terms-and-features/vocabularies) | -| Codes (identifiers) | This is a mechanism that CluedIn uses to define the uniqueness of a golden record.
Read the documentation to understand the concept of origin code (primary identifier) and codes (identifiers). | [Link](/key-terms-and-features/entity-codes) | -| Origin | The origin generally determines the source of golden records, and it is used in codes (identifiers). Read the documentation to understand the importance of the origin. | [Link](/key-terms-and-features/origin) | +| Identifiers | This is a mechanism that CluedIn uses to define the uniqueness of a golden record. Read the documentation to understand the concept of the primary identifier and additional identifiers. | [Link](/key-terms-and-features/entity-codes) | +| Origin | The origin generally determines the source of golden records, and it is used in identifiers. Read the documentation to understand the importance of the origin. | [Link](/key-terms-and-features/origin) | ## Mapping process overview @@ -52,19 +52,19 @@ To get a good default mapping configuration, use the **auto-mapping** option. ![mapping-type.png](../../assets/images/playbooks/mapping-type.png) -It is a great way to start and define the most important mapping attributes—**entity type** and **vocabulary**. You can learn more about the process of creating mapping and find step-by-step instructions in a dedicated [article](/integration/create-mapping). +It is a great way to start and define the most important mapping attributes—**business domain** and **vocabulary**. You can learn more about the process of creating a mapping and find step-by-step instructions in a dedicated [article](/integration/create-mapping).
In order to avoid confusion for those consuming the records, it is a good practice to map to the source vocabulary first. ![configure-mapping.png](../../assets/images/playbooks/configure-mapping.png) -## Setting up the right codes (identifiers) +## Setting up the right identifiers -After you selected the right entity type and vocabulary, it's time to choose the right key to produce the codes for your records. +After you have selected the right business domain and vocabulary, it's time to choose the right key to produce the identifiers for your records. -Poorly defined codes can have a truly unexpected impact. The most common pitfall of poorly defined codes is what we call _over-merging_. It happens when you set up a code that is not actually unique. Suppose you choose the country code as a key, then all records with the same country code will merge together into one record. For example, if you have 100,000 records with the country code of "DK", then all those 100,000 records will end up as 1 record. +Poorly defined identifiers can have a truly unexpected impact. The most common pitfall of poorly defined identifiers is what we call _over-merging_. It happens when you set up an identifier that is not actually unique. Suppose you choose the country code as a key, then all records with the same country code will merge together into one record. For example, if you have 100,000 records with the country code of "DK", then all those 100,000 records will end up as 1 record. -Poorly defined codes can cause system slowdowns.
While **reverting is possible, it will take time** as now CluedIn will need to _split_ those records. At that point, the fastest solution is to remove those records from CluedIn and restart the mapping. With the country code example, it is easy to understand that using non-unique properties as identifiers is not a good choice. Sometimes, the key you consider unique is not in fact unique. For example, a SKU code—unique internal product code—can have the following issues: @@ -86,11 +86,11 @@ Of course, CluedIn has ways to fix such data quality issues. However, if you bli This is an introductory article on the topic of mapping. We'll add more detailed articles soon, as mapping is a significant part of the processes in CluedIn. As an outcome of this article, you know about 2 important decisions you have to make while creating mapping: -- Choose the **right entity type** (business domain). +- Choose the **right business domain**. - Choose the **right vocabulary** that is close to the source. -Additionally, you now know about the importance of choosing the **unique key** for producing the codes (identifiers). If you are unsure whether the selected key is unique, you can use a deduplication project to detect and merge duplicates. +Additionally, you now know about the importance of choosing the **unique key** for producing the identifiers. If you are unsure whether the selected key is unique, you can use a deduplication project to detect and merge duplicates. ## Next step