From 370240cf04ad9d407608b9e8ef764ac5e2cac11e Mon Sep 17 00:00:00 2001
From: Harsh Joshi
Date: Tue, 8 Apr 2025 15:38:52 +0530
Subject: [PATCH 1/4] Add smart hashing documentation in readme

---
 README.md | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/README.md b/README.md
index a005b8746d..0a496449e2 100644
--- a/README.md
+++ b/README.md
@@ -960,6 +960,73 @@ There are a few subtle differences from the Fetch API which are meant to limit t
 - some options and behaviors are not applicable to Node.js and will be ignored by `node-fetch`. See this list of [known differences](https://github.com/node-fetch/node-fetch/blob/1780f5ae89107ded4f232f43219ab0e548b0647c/docs/v2-LIMITS.md).
 - `method` will automatically get upcased for consistency.

+## Hashing PII with `processHashing`
+
+Use the `processHashing` utility to hash Personally Identifiable Information (PII) such as email addresses and phone numbers in a consistent, secure, and smart way.
+
+It ensures:
+
+- Consistent hashing across use cases
+- Prevention of double-hashing (detects if a value is already hashed)
+- Optional normalization of input (e.g., trimming, removing special characters)
+- Rollout control via feature flags or bypass slug support
+
+#### Function Signature
+
+```ts
+processHashing(
+  value: string,
+  encryptionMethod: EncryptionMethod, // e.g., 'sha256'
+  digest: DigestType, // e.g., 'hex' or 'base64'
+  features: Features | undefined, // Optional: feature flag object
+  destinationSlugForBypass: string, // Optional: bypasses flag check for listed slugs
+  cleaningFunction?: (value: string) => string // Optional: custom normalization function
+): string
+```
+
+#### Example 1: Hashing an Email Address
+
+```ts
+import { processHashing } from 'destination-actions/lib/hashing-utils'
+
+const email = ' Person@email.com '
+const hashedEmail = processHashing(email, 'sha256', 'hex', undefined, '', (value) => value.trim().toLowerCase())
+
+console.log(hashedEmail) // hashed string
+```
+
+#### Example 2: Hashing a Phone Number
+
+```ts
+const phone = '+1(706)-767-5127'
+const normalizePhone = (value: string) => value.replace(/[^0-9]/g, '')
+
+const hashedPhone = processHashing(phone, 'sha256', 'hex', undefined, '', normalizePhone)
+
+console.log(hashedPhone) // hashed string
+```
+
+#### Internals and Behavior
+
+- **Empty string**: Returns `''` if input is empty or just whitespace.
+- **Smart hashing**:
+  - If the value **looks already hashed** (length + format), it returns as-is.
+  - Otherwise, it hashes the cleaned value.
+- **Feature flag check**:
+  - If `features['smart-hashing']` is `true`, it uses smart detection, otherwise it will just hash and return.
+  - If the `destinationSlugForBypass` is in the internal `slugsToBypassFeatureFlag` list, the feature flag is skipped and smart hashing is enabled by default.
+
+#### Supported Hash Algorithms
+
+- `md5`
+- `sha1`
+- `sha224`
+- `sha256` ✅ recommended
+- `sha384`
+- `sha512`
+
+Each supports output in either `hex` or `base64` digest format.
+
 ## Support

 For any issues, please contact our support team at partner-support@segment.com.

From 6a51361ab1f749842bb802da8ec6fa7022cec148 Mon Sep 17 00:00:00 2001
From: Harsh Joshi
Date: Wed, 9 Apr 2025 15:28:24 +0530
Subject: [PATCH 2/4] Update smart hashing documentation

---
 README.md | 83 +++++++++++++++++++++++++++----------------------------
 1 file changed, 41 insertions(+), 42 deletions(-)

diff --git a/README.md b/README.md
index 0a496449e2..a2b877a583 100644
--- a/README.md
+++ b/README.md
@@ -960,72 +960,67 @@ There are a few subtle differences from the Fetch API which are meant to limit t
 - some options and behaviors are not applicable to Node.js and will be ignored by `node-fetch`. See this list of [known differences](https://github.com/node-fetch/node-fetch/blob/1780f5ae89107ded4f232f43219ab0e548b0647c/docs/v2-LIMITS.md).
 - `method` will automatically get upcased for consistency.

-## Hashing PII with `processHashing`
+## Improved Hashing Detection with `processHashing`

-Use the `processHashing` utility to hash Personally Identifiable Information (PII) such as email addresses and phone numbers in a consistent, secure, and smart way.
+Use the `processHashing` utility to hash Personally Identifiable Information (PII)—like email addresses and phone numbers.

-It ensures:
+This utility simplifies workflows by automatically detecting and handling pre-hashed values, ensuring compatibility across destinations and preventing common data-matching issues.

-- Consistent hashing across use cases
-- Prevention of double-hashing (detects if a value is already hashed)
-- Optional normalization of input (e.g., trimming, removing special characters)
-- Rollout control via feature flags or bypass slug support
+### Key Benefits

-#### Function Signature
+- **Automatic Hashing Detection**: Avoids double-hashing by identifying already hashed values.
+- **Consistent Hashing**: Applies the correct algorithm and format across use cases.
+- **Optional Input Normalization**: Supports custom logic for cleaning and standardizing inputs (e.g., trimming, formatting).

-```ts
-processHashing(
-  value: string,
-  encryptionMethod: EncryptionMethod, // e.g., 'sha256'
-  digest: DigestType, // e.g., 'hex' or 'base64'
-  features: Features | undefined, // Optional: feature flag object
-  destinationSlugForBypass: string, // Optional: bypasses flag check for listed slugs
-  cleaningFunction?: (value: string) => string // Optional: custom normalization function
-): string
-```
-
-#### Example 1: Hashing an Email Address
+### Example 1: Hashing an Email Address

-```ts
-import { processHashing } from 'destination-actions/lib/hashing-utils'
+```
+  import { processHashing } from 'destination-actions/lib/hashing-utils'

-const email = ' Person@email.com '
-const hashedEmail = processHashing(email, 'sha256', 'hex', undefined, '', (value) => value.trim().toLowerCase())
+  const email = ' Person@email.com '
+  const hashedEmail = processHashing(
+    email,
+    'sha256',
+    'hex',
+    (value) => value.trim().toLowerCase()
+  )

-console.log(hashedEmail) // hashed string
+  console.log(hashedEmail) // hashed string
 ```

-#### Example 2: Hashing a Phone Number
+### Example 2: Hashing a Phone Number

-```ts
-const phone = '+1(706)-767-5127'
-const normalizePhone = (value: string) => value.replace(/[^0-9]/g, '')
+```
+  const phone = '+1(706)-767-5127'
+  const normalizePhone = (value: string) => value.replace(/[^0-9]/g, '')

-const hashedPhone = processHashing(phone, 'sha256', 'hex', undefined, '', normalizePhone)
+  const hashedPhone = processHashing(
+    phone,
+    'sha256',
+    'hex',
+    normalizePhone
+  )

-console.log(hashedPhone) // hashed string
+  console.log(hashedPhone) // hashed string
 ```

-#### Internals and Behavior
+### Notes

-- **Empty string**: Returns `''` if input is empty or just whitespace.
-- **Smart hashing**:
-  - If the value **looks already hashed** (length + format), it returns as-is.
-  - Otherwise, it hashes the cleaned value.
-- **Feature flag check**:
-  - If `features['smart-hashing']` is `true`, it uses smart detection, otherwise it will just hash and return.
-  - If the `destinationSlugForBypass` is in the internal `slugsToBypassFeatureFlag` list, the feature flag is skipped and smart hashing is enabled by default.
+**Empty Input Handling**: Returns `''` for empty or whitespace-only strings.

-#### Supported Hash Algorithms
+**Supported Hash Algorithms**

 - `md5`
 - `sha1`
 - `sha224`
-- `sha256` ✅ recommended
+- `sha256`
 - `sha384`
 - `sha512`

-Each supports output in either `hex` or `base64` digest format.
+All algorithms support `hex` and `base64` digest formats.
+
+**Requesting Additional Algorithms**
+To request additional hash algorithms, contact partner-support@segment.com.

 ## Support

@@ -1058,3 +1053,7 @@ SOFTWARE.
 ## Contributing

 All third party contributors acknowledge that any contributions they provide will be made under the same open source license that the open source project is provided under.
+
+```
+
+```

From 4b43e09b1ad0904c21acbadb64daf4a12fcdf037 Mon Sep 17 00:00:00 2001
From: Harsh Joshi
Date: Wed, 9 Apr 2025 15:31:31 +0530
Subject: [PATCH 3/4] update heading

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a2b877a583..25538890bf 100644
--- a/README.md
+++ b/README.md
@@ -960,7 +960,7 @@ There are a few subtle differences from the Fetch API which are meant to limit t
 - some options and behaviors are not applicable to Node.js and will be ignored by `node-fetch`. See this list of [known differences](https://github.com/node-fetch/node-fetch/blob/1780f5ae89107ded4f232f43219ab0e548b0647c/docs/v2-LIMITS.md).
 - `method` will automatically get upcased for consistency.

-## Improved Hashing Detection with `processHashing`
+## Automatic Hashing Detection with `processHashing`

 Use the `processHashing` utility to hash Personally Identifiable Information (PII)—like email addresses and phone numbers.


From 6a526d91a605ec3b22123ed501877d13bce41223 Mon Sep 17 00:00:00 2001
From: Harsh Joshi
Date: Sun, 13 Apr 2025 10:58:29 +0530
Subject: [PATCH 4/4] update docs

---
 README.md | 25 +++----------------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/README.md b/README.md
index 25538890bf..33366e1897 100644
--- a/README.md
+++ b/README.md
@@ -962,15 +962,11 @@ There are a few subtle differences from the Fetch API which are meant to limit t

 ## Automatic Hashing Detection with `processHashing`

-Use the `processHashing` utility to hash Personally Identifiable Information (PII)—like email addresses and phone numbers.
+Our popular Segment Adtech destinations support [automatic hash detection](https://segment.com/docs/connections/destinations/#hashing) of personally identifiable information (PII). If your destination hashes PII data, we recommend using the `processHashing` utility instead of `createHash` from the `crypto` module.
+
-This utility simplifies workflows by automatically detecting and handling pre-hashed values, ensuring compatibility across destinations and preventing common data-matching issues.
+The `processHashing` utility supports the `md5`, `sha1`, `sha224`, `sha256`, `sha384`, and `sha512` hashing algorithms. It can output digests in `hex` or `base64` format.

-### Key Benefits
-
-- **Automatic Hashing Detection**: Avoids double-hashing by identifying already hashed values.
-- **Consistent Hashing**: Applies the correct algorithm and format across use cases.
-- **Optional Input Normalization**: Supports custom logic for cleaning and standardizing inputs (e.g., trimming, formatting).
+**Note**: For empty or whitespace-only strings, `processHashing` returns an empty string instead of throwing an error like `createHash` from the `crypto` module.

 ### Example 1: Hashing an Email Address

@@ -1004,21 +1000,6 @@ This utility simplifies workflows by automatically detecting and handling pre-ha
   console.log(hashedPhone) // hashed string
 ```

-### Notes
-
-**Empty Input Handling**: Returns `''` for empty or whitespace-only strings.
-
-**Supported Hash Algorithms**
-
-- `md5`
-- `sha1`
-- `sha224`
-- `sha256`
-- `sha384`
-- `sha512`
-
-All algorithms support `hex` and `base64` digest formats.
-
 **Requesting Additional Algorithms**
 To request additional hash algorithms, contact partner-support@segment.com.
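
The detect-then-hash behavior that PATCH 1/4 lists under "Internals and Behavior" can be illustrated with a short sketch. This is not the `processHashing` implementation: `sketchProcessHashing`, `looksHashed`, and `HEX_LENGTH` are hypothetical names, and the length-plus-character-set check is only an assumed reading of "looks already hashed (length + format)". Feature-flag and bypass-slug handling are left out.

```typescript
import { createHash } from 'crypto'

type HashAlgorithm = 'md5' | 'sha1' | 'sha224' | 'sha256' | 'sha384' | 'sha512'
type Digest = 'hex' | 'base64'

// Hex digest length for each algorithm (digest size in bits / 4).
const HEX_LENGTH: Record<HashAlgorithm, number> = {
  md5: 32,
  sha1: 40,
  sha224: 56,
  sha256: 64,
  sha384: 96,
  sha512: 128
}

// Assumed heuristic: a value "looks hashed" when its length and character set
// match what the chosen algorithm and digest format would produce.
function looksHashed(value: string, algorithm: HashAlgorithm, digest: Digest): boolean {
  const hexLength = HEX_LENGTH[algorithm]
  if (digest === 'hex') {
    return new RegExp(`^[0-9a-fA-F]{${hexLength}}$`).test(value)
  }
  // Padded base64 encodes every 3 bytes as 4 characters.
  const base64Length = 4 * Math.ceil(hexLength / 2 / 3)
  return value.length === base64Length && /^[A-Za-z0-9+/]+={0,2}$/.test(value)
}

// Hypothetical stand-in for the documented behavior, not the real utility.
function sketchProcessHashing(
  value: string,
  algorithm: HashAlgorithm,
  digest: Digest,
  cleaningFunction?: (value: string) => string
): string {
  if (value.trim() === '') return '' // empty or whitespace-only input
  if (looksHashed(value, algorithm, digest)) return value // avoid double-hashing
  const cleaned = cleaningFunction ? cleaningFunction(value) : value
  return createHash(algorithm).update(cleaned).digest(digest)
}

const hashedEmail = sketchProcessHashing(' Person@email.com ', 'sha256', 'hex', (v) => v.trim().toLowerCase())
console.log(hashedEmail.length) // 64 hex characters for sha256
console.log(sketchProcessHashing(hashedEmail, 'sha256', 'hex') === hashedEmail) // true: returned as-is
```

A heuristic like this can misclassify raw input that happens to have the same shape as a digest, which is one reason to treat it as an illustration rather than a specification.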