-
Notifications
You must be signed in to change notification settings - Fork 126
fix: Only strip primary keys when they are fixed chars, not for varchars #1472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
/gcbrun |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions
docs/internal/strings.md
Outdated
|
||
DVT correctly handles Teradata strings encoded in ISO-8859 (Latin) encoding [PR 1226](https://github.com/GoogleCloudPlatform/professional-services-data-validator/pull/1226) | ||
|
||
## Handling NCHAR and NVARCHAR data types in MS SQL and Oracle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use "SQL Server" instead of "MS SQL". We all know what "MS SQL" means but I don't think it is an official product name.
@@ -141,6 +140,31 @@ INSERT INTO `pso_data_validator`.`dvt_char_id` VALUES | |||
('DVT4 ', 'Row 4 '), | |||
('DVT5 ', 'Row 5'); | |||
|
|||
# BigQuery does not have a fixed char data type and blanks at the end of strings are not removed by the database. | |||
# Other databases have a fixed char data type and these tables are to allow the comparison of those columns with BQ and each other | |||
CREATE OR REPLACE TABLE `pso_data_validator`.`dvt_varchar_id` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have test table dvt_string_id
. Do we either not need dvt_varchar_id
or need to remove dvt_string_id
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me the tables dvt_string_id
and dvt_char_id
were confusing - not clear about the purpose of each. So I created new tables with names that are pretty clear. Also the data in those tables assumed you were stripping string primary keys. The data in the dvt_varchar_id
includes a row with a primary key with a trailing blank. This would have been stripped in the earlier design. Now it will not.
Yes - maybe in a few months, we can remove the tables dvt_string_id
and dvt_char_id
from our databases.
Thank you.
Sundar Mudupalli
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've removed the tests that use dvt_string_id
so I assumed the new table was a replacement for the old one. If they serve different purposes should the old tests be left in place? If the table is redundant we can remove it from the source code but not actually drop it until your changes reach develop
.
@@ -258,6 +258,32 @@ INSERT INTO pso_data_validator.dvt_char_id VALUES ('DVT4', 'Row 4'); | |||
INSERT INTO pso_data_validator.dvt_char_id VALUES ('DVT5', 'Row 5'); | |||
COMMIT; | |||
|
|||
DROP TABLE pso_data_validator.dvt_fixed_char_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as for BigQuery, the new tables appear to be duplicates of existing tables.
…moved, resulting in one fewer line output; No harm in reading that property from old yaml files, it is no longer used
/gcbrun |
/gcbrun |
/gcbrun |
Description of changes
DVT correctly handles character (fixed or varchar) primary keys - specifically for Oracle, Postgres and Teradata. As noted in issue 1470, DVT does not handle fixed length chars primary keys for SQL Server and Db2. Random rows don't work well with fixed length char primary keys.
This change also deprecates the
-tsp
option - it is removed from the documentation and users get a warning.Issues to be closed
Closes #1368
Checklist
CONTRIBUTING
Guide.tests/local_check.sh
script)