rettinghaus
Contributor

This PR adds a full stop to (almost) all Italian descriptions (as was done for other languages in #2578). It also makes some minor corrections for typos in the relevant language.

@sydb
Member

sydb commented Jun 2, 2025

@raffazizzi — The only files that have changes to the Italian content other than an ultimate period are listed below. Some of the changes may deserve an update to @versionDate, but none have had it.

  • Source/Specs/TEI.xml
  • Source/Specs/abbr.xml
  • Source/Specs/arc.xml
  • Source/Specs/att.citing.xml
  • Source/Specs/att.datable.iso.xml
  • Source/Specs/att.measurement.xml
  • Source/Specs/att.repeatable.xml
  • Source/Specs/dimensions.xml
  • Source/Specs/msDesc.xml
  • Source/Specs/node.xml
  • Source/Specs/postscript.xml
  • Source/Specs/schemas.xml
  • Source/Specs/teiCorpus.xml
  • Source/Specs/usg.xml
  • Source/Specs/zone.xml

how I obtained that list (in case anyone is curious)

  • Checked out our dev branch into a directory, call it A/.
  • Checked out @rettinghaus’ branch into a different directory, call it B/. (I had to find instructions on how to do this on the web — Stackoverflow, I think — because even though I have done it a few times before, I can never get the syntax right.)
  • From directory A/P5/, issued for f in Source/Specs/*.xml ; do echo ; echo -n "---------$f:" ; diff -q <(perl -pe 's,\.</desc,</desc,;' $f) <(perl -pe 's,\.</desc,</desc,;' [path_to_B]/P5/$f) ; done | fgrep 'differ'.
  • Hand-tweaked the results so that they are readable. (I.e., the equivalent of replace( $line, '^---------([^:]+):Files.*differ$', '* $1').)
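
The steps above can be sketched as a small script. This is a minimal sketch under stated assumptions, not the exact command used: it swaps perl for sed, compares the normalized text directly instead of parsing diff -q messages, and the function names strip_final_period and list_changed_specs are hypothetical. Like the original substitution, it only strips the first period that sits directly before </desc on each line.

```shell
#!/bin/sh
# Sketch: list Spec files that differ by more than a final period before </desc>.

# Remove a full stop that directly precedes a closing </desc tag
# (first match per line only, mirroring the perl substitution).
strip_final_period() {
    sed 's,\.</desc,</desc,' "$1"
}

# Usage: list_changed_specs path/to/A/P5 path/to/B/P5
list_changed_specs() {
    dir_a=$1
    dir_b=$2
    for f in "$dir_a"/Source/Specs/*.xml; do
        [ -e "$f" ] || continue              # glob matched nothing
        rel=${f#"$dir_a"/}                   # e.g. Source/Specs/abbr.xml
        other=$dir_b/$rel
        [ -f "$other" ] || continue          # file exists on one side only
        # Compare the two files after normalization.
        if [ "$(strip_final_period "$f")" != "$(strip_final_period "$other")" ]; then
            printf '* %s\n' "$rel"
        fi
    done
}
```

Because the comparison is done on the normalized text itself, the output is already in list form and needs no hand-tweaking.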

Note that this only works because none of our Spec files has a name that includes the string "differ". If one did, I would have to use a smarter command than just fgrep 'differ' at the end.
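
One possible "smarter command" is to match the shape of the diff -q message rather than the bare word. The sketch below is an assumption-laden illustration: the function name only_differ_messages is hypothetical, and it assumes diff prints its messages in English (for example, when run with LC_ALL=C).

```shell
# Keep only lines that end in a diff -q "Files X and Y differ" message.
# A marker line such as "---------Source/Specs/differ.xml:" ends in a colon,
# so it no longer matches even though the file name contains "differ".
only_differ_messages() {
    grep -E 'Files .* differ$'
}
```

It would replace the final fgrep 'differ' stage of the pipeline.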

how I checked @versionDate values (in case anyone is curious)

xmlstarlet sel -N t=http://www.tei-c.org/ns/1.0 -t -m "//t:desc[@xml:lang='it']" -v "@versionDate" -n [list of files from above] | sort | uniq -c | sort -nrs
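
The last three stages of that pipeline do the counting: sort groups identical dates, uniq -c prefixes each distinct date with its count, and sort -nrs lists the most frequent first. A minimal sketch of just that stage follows; the function name tally_values is hypothetical, and the sample dates are made up for illustration rather than taken from the Spec files.

```shell
# Count how often each line occurs on stdin, most frequent first.
tally_values() {
    sort | uniq -c | sort -nrs
}

# Made-up sample input, not real @versionDate values:
# lists 2007-01-21 (count 3) before 2025-06-02 (count 1).
printf '%s\n' 2007-01-21 2025-06-02 2007-01-21 2007-01-21 | tally_values
```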

@ebeshero
Member

ebeshero commented Jul 31, 2025

@sydb In case you are wondering if anyone was curious how you accomplished your diffing, I was, but the shell script was a bit impenetrable. After some digging I discovered that what you’re doing is a pretty complicated diff, and it would help us greatly to explain in a sentence or two what you’re doing before just pasting in the code like a recipe. I get that the recipe works for this particular PR, but we might want to follow the logic of it in future.

If I understand this correctly, you’re evaluating the differences between the original Specs and those that @rettinghaus modified, to make sure we are aware of any changes that are not just the addition of terminal periods (and a new version of your script would likely include closing parentheses). So you normalize the files in both directories by removing all the periods (and probably end parentheses).

We need to know something about diff -q to understand what you’re doing with “differ” I think. Isn’t this the “quiet” mode of diff? Does it output the word “differ” as an announcement that a file is altered? If that’s the case, it should work fine even if the word “differ” is in the spec file name, shouldn’t it? (OK, it might need an adjustment to find the lines of output starting with "differ ")

This approach to studying directory changes is something we need, like a useful “Swiss Army Knife” tool in Council’s kit. While I realize explaining one’s clever solutions is entirely voluntary, this is such a useful thing to know how to do that explaining the “big picture” strategy will help all of us who don’t natively speak your shell. (When I do this sort of thing, by the way, I use XQuery over two XML collections, normalize with an XPath function like replace(), and basically proceed in a different idiom. So it helps me to know what you’re trying to say in your idiom. :-)

@sydb
Member

sydb commented Aug 10, 2025

@ebeshero — Fair point, although I generally think something is better than nothing, especially here on the ticket. (This latest discussion you & I are having does not really belong on this ticket, but it is rooted here, and we can copy-and-paste the important bits elsewhere.)
As we are up against a deadline, I am not going to detail the shell commands above, but will say that you got it right: diff -q uses the word "differ" when two files are different. IIRC, diff -q dirA/ dirB/ only produces 4 messages:

  1. Common subdirectories: dirA/subdir and dirB/subdir
  2. Only in dirA: file_alpha.xml
  3. Only in dirB: file_beta.xml
  4. Files dirA/not_the_same.xml and dirB/not_the_same.xml differ
    Personally I find the syntax of #4 really annoying: you can’t just type diff and then copy-and-paste the two filenames from the message to your commandline, because the word “and” is in the middle.
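
One way around that annoyance is to let sed rewrite the message into a command. This is a sketch only: the function name messages_to_diff_commands is hypothetical, and it assumes English diff messages and paths that contain neither whitespace nor the string " and ".

```shell
# Rewrite each "Files X and Y differ" message as a ready-to-run "diff X Y"
# command; all other diff -q messages are dropped.
messages_to_diff_commands() {
    sed -n 's/^Files \(.*\) and \(.*\) differ$/diff \1 \2/p'
}
```

Usage would look like diff -q dirA/ dirB/ | messages_to_diff_commands, which prints one diff command per differing pair.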
