Skip to content

Deprecate HTML4-error-reporting methods/classes and update pom.xml for the 1.5 release #22

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: release-1.5
Choose a base branch
from
Open
25 changes: 9 additions & 16 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@
<groupId>nu.validator.htmlparser</groupId>
<artifactId>htmlparser</artifactId>
<packaging>bundle</packaging>
<version>1.4</version>
<version>1.5</version>
<name>htmlparser</name>
<url>http://about.validator.nu/htmlparser/</url>
<description>The Validator.nu HTML Parser is an implementation of the HTML5 parsing algorithm in Java for applications. The parser is designed to work as a drop-in replacement for the XML parser in applications that already support XHTML 1.x content with an XML parser and use SAX, DOM or XOM to interface with the parser.</description>
Expand Down Expand Up @@ -69,8 +69,8 @@
</license>
</licenses>
<scm>
<connection>scm:hg:http://hg.mozilla.org/projects/htmlparser/</connection>
<url>http://hg.mozilla.org/projects/htmlparser/</url>
<connection>scm:git:https://github.com/validator/htmlparser.git</connection>
<url>https://github.com/validator/htmlparser</url>
</scm>
<build>
<sourceDirectory>${project.build.directory}/src</sourceDirectory>
Expand All @@ -80,22 +80,13 @@
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.5</source>
<target>1.5</target>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.7</version>
<dependencies>
<dependency>
<groupId>com.sun</groupId>
<artifactId>tools</artifactId>
<version>1.5.0</version>
<scope>system</scope>
<systemPath>${java.home}/../lib/tools.jar</systemPath>
</dependency>
</dependencies>
<executions>
<execution>
<id>intitialize-sources</id>
Expand All @@ -121,6 +112,7 @@
</goals>
<configuration>
<target>
<property name="build.compiler" value="extJavac"/>
<property name="translator.sources" value="${basedir}/translator-src"/>
<property name="translator.classes" value="${project.build.directory}/translator-classes"/>
<mkdir dir="${translator.classes}"/>
Expand Down Expand Up @@ -157,7 +149,8 @@
<Bundle-Name>${project.name}</Bundle-Name>
<Bundle-SymbolicName>nu.validator.htmlparser</Bundle-SymbolicName>
<Bundle-Version>${project.version}</Bundle-Version>
<Bundle-RequiredExecutionEnvironment>J2SE-1.5</Bundle-RequiredExecutionEnvironment>
<Bundle-RequiredExecutionEnvironment>JavaSE-1.7</Bundle-RequiredExecutionEnvironment>
<Automatic-Module-Name>nu.validator.htmlparser</Automatic-Module-Name>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained in #17 (comment), I believe this should be reverted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this, no need to revert (see #26)

<_removeheaders>Built-By,Bnd-LastModified</_removeheaders>
</instructions>
</configuration>
Expand Down Expand Up @@ -214,7 +207,7 @@
<dependency>
<groupId>xom</groupId>
<artifactId>xom</artifactId>
<version>1.1</version>
<version>[1.3.5,)</version>
Comment on lines -217 to +210
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained in #17 (comment), I believe this should be reverted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the dependency upgrade (see #26)

However, I hold that the use of version ranges should be reverted, as it's not a common Maven practice. See here for a typical Maven example. As already mentioned here, I believe fixed versions should be used, among other things "for the sake of build reproducibility and stability".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version ranges should be reverted, as it's not a common Maven practice

It is perfectly common and I use them after feedback from my downstream users, which may not always get the versions that they want if I do not do that. In fact, in the "typical Maven example" that you link, they say:

These dependencies should be aligned with the ones from the WildFly version we support

So they use these exact dependencies as the result of an upstream decision.

I do not have a strong opinion against using single versions, but the version ranges are self-explanatory about the supported versions and (my) downstream users prefer that.

for the sake of build reproducibility and stability

Maven builds are not reproducible unless you use the reproducible-build-maven-plugin (which I do have in some projects), and even then the result may vary. And if some newer version of a dependency breaks the build, that's something that I'd personally prefer to know sooner (at upstream build/deploy time) rather than later (when downstream hits it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, "common" was a bad choice of words. Let me state it differently: I very rarely, if ever, come across Maven POMs that use version ranges for dependencies.

From a quick look at some of the most popular Java libraries/frameworks on GitHub that use Maven, I don't find any that use version ranges for dependencies: Maven itself, Guava, Byte Buddy, Spring Data JPA, Hibernate Validator, commons-lang, Quarkus, Helidon, ... (same holds for a few Gradle projects I looked at, such as Spring Framework)

Guava has nearly 197000 dependents. Now I don't know how many of those use Maven, but if the lack of version ranges would really pose a problem for downstream consumers, I'm pretty sure Guava would be using them by now.

By using fixed versions, you always know for each commit which version was used for each dependency: so if it built 2 years ago, it'll still build now. And if 2 developers clone the repo and build any given commit, they're always going to have equivalent results w.r.t. dependencies.

And w.r.t. downstream users: say 2 years ago LibA 2.1 was published, with a dependency on LibB declared as [1.7.1,).
Now if today I declare LibA 2.1 in my POM, and the build fails because LibB is at version 6.5.8, which is incompatible with LibA 2.1, then what do I do? At this point, all I know is: "at least some version >= 1.7.1 that was available when LibA was published, is compatible". And since Maven can resolve very surprising versions, such as alpha/beta releases (there's even an open bug about Maven resolving SNAPSHOT versions for non-SNAPSHOT bounds), it's even possible that LibA 2.1 is not compatible with any released version of LibB.
On the other hand, if LibA would've used fixed versions, things would just work.

As for detecting compatibility issues with new versions of dependencies: I agree that it's better to detect this sooner rather than later, but I believe there are better ways to do so.
If we consider a downstream user again: if htmlparser 1.5 declares XOM [1.3.5,), then 6 months from now, some downstream user might file a bug saying htmlparser 1.5 is not compatible with XOM 1.3.6. And then the burden would be on htmlparser to fix the issue, because it basically declared "any version >= 1.3.5 works". On the other hand, by declaring XOM 1.3.5, a downstream user might still file a bug, but it's clear that htmlparser is not at fault, and never claimed it was compatible with any version > 1.3.5.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the lack of version ranges would really pose a problem for downstream consumers, I'm pretty sure Guava would be using them by now

I was told otherwise by a former Maven developer, although perhaps there was a misunderstanding. And very large projects are much less likely to use dependency ranges.

Maven dependency version resolution is supposed to be somewhat flexible, and I'm not sure under which circumstances the problem would happen, but I was told that the safe approach (in a complex environment where fixed versions of the same dependency were specified by multiple libraries) was to configure exclusions. That may affect very big projects only though, and I haven't really seen a detailed account of it.

all I know is: "at least some version >= 1.7.1 that was available when LibA was published, is compatible"

A dependency range does not give less information than a fixed one. If you know that the minimum version is 1.7.1, that means you can build with exactly that one. Otherwise, the range is wrong.

since Maven can resolve very surprising versions, such as alpha/beta releases

And such a resolution problem would affect the upstream developer, not downstream. If a beta is not compatible, that's a bug that you may have to file.

it's better to detect this sooner rather than later, but I believe there are better ways to do so.

In css4j I have detected some important upstream issues by this method. BTW this requires a significant amount of tests that are able to detect such problems (about 45% of the css4j code is test-related).

then 6 months from now, some downstream user might file a bug saying htmlparser 1.5 is not compatible with XOM 1.3.6

That's the other part of the issue: in my opinion, users have the right expectation that a supported project is going to run with reasonably recent dependencies, and if that is not the case they sometimes may file a bug with the dependency, sometimes with you. I have had a few of those, and have to deal with that as a maintainer.

This project can choose to depend on a fixed version (and some people are going to understand that the library is endorsing that version, or can only work with it), or specify a range (and other people are going to have ridiculous expectations about what that means, as you mention).

The range could be more conservative than what I suggested: for example [1.3.5,2) could be safer for XOM. I specified a range for two of the dependencies (XOM and ICU4J, no plan to do the same with others), and these are unlikely to cause problems in the future. ICU4J numbering in particular is growing fast (currently at 67.1 in Maven Central).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally found out what was going on with the fixed dependencies; it appears that for downstream users, setting this:

<version>1.3.5</version>

is effectively the same as:

<version>[1.3.5,)</version>

which is not what I read in:

https://maven.apache.org/pom.html#dependency-version-requirement-specification

"1.0: Soft requirement for 1.0. Use 1.0 if no other version appears earlier in the dependency tree."

nor in:

https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#transitive-dependencies

"Maven picks the "nearest definition". That is, it uses the version of the closest dependency to your project in the tree of dependencies. You can always guarantee a version by declaring it explicitly in your project's POM. Note that if two dependency versions are at the same depth in the dependency tree, the first declaration wins."

Nope, the real behaviour is described here:

https://docs.oracle.com/middleware/1212/core/MAVEN/maven_version.htm#CJHDEHAB

"1.0 | It generally means 1.0 or a later version, if 1.0 is not available. Various Maven plug-ins may interpret this differently, so it is safer to use one of the other, more specific options."

That is, Oracle recommends using version ranges...

More from that page:

"When Maven encounters multiple matches for a version reference, it uses the highest matching version. Generally, version references should be only as specific as required so that Maven is free to choose a new version of dependencies where appropriate"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marbaise says on the same page:

If you define the versions in your pom file the dependency tree is always the same which means you don't need to define all transitive dependencies and you saved a lot of work. Except if you use versions ranges in Maven

I hope we can agree that (a) this is not the point of view you're defending here, and (b) given Marbaise's Maven expertise, it's much more likely that his view is in line with best practices.

I am not aware of popular libraries that use ranges, but I have seen ranges used for internal components in application servers, and 'popularity' is something that has to be put always in perspective.

I rest my case. It is beyond me why we're still having this discussion, if it's impossible to readily give me a list of projects that use version ranges and are at least somewhat popular. And just to give some verifiable counterexamples w.r.t. application servers: from a quick look, neither WildFly nor Payara use version ranges.

And yes those people do specify fixed versions for every and each of their dependencies, and are highly proficient.

Which, as mentioned above, is not Marbaise's point of view. And "are highly proficient" is an unverifiable opinion, whereas it's trivial to verify that Marbaise is a Maven expert.

AFAIR they use a commercial 'deployment tool' which could be part of the problem, I don't really know.

I'd say this speaks for itself.

With that being said, I'm out for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With that being said, I'm out for now.

Well, that was before I read your latest comment. Honestly, I'm mindblown by your ability to select and interpret information in such a way that it matches your beliefs.

That page is about Maven in the context of Oracle Fusion Middleware. And yes, in that context, it's reasonable to declare a dependency as [12.1.2,12.1.3), because you don't want to have to bump the dependency version whenever the middleware server is patched. This is about using strict version ranges for a very specific use case in a tightly-controlled environment.

Also note that this is about the use of version ranges in middleware applications, not libraries.

As for the first part of your comment: if you believe it presents a valid point, please describe a concrete example (like: "A has a dependency on B, and B has a dependency on C. If A's dependencies section declares ... and B's dependencies section declares ..., then changing the section(s) as follows ..., will give equivalent results when doing a build of A"), or better yet: set up a repo with 3 folders A/B/C, each containing a Maven project, so I can see for myself what you mean.

So let me try that again: with that being said, I'm out for now.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mr. @anthonyvdotbe is "out" (and hopefully won't come back) and I'm tired of this thread as well, but if anyone wants to know about how those build failures (in complex projects) happen, there is an informative blog post here:

https://ourcraft.wordpress.com/2016/08/22/how-to-read-maven-enforcer-plugins-requireupperbounddeps-rule-failure-report/

My downstream project's build error was of the kind described there, and version ranges are an easy solution to the problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize the thread so far:

Given the following dependency graph:

root -> htmlparser X -> xom 1.3.5
     -> libB Y -> xom 1.3.6

Then when root's POM declares:

  1. no xom dependency and
    1. htmlparser is declared first or xom is excluded from libB: PASS with 1.3.5
    2. libB is declared first or xom is excluded from htmlparser: PASS with 1.3.6
  2. xom 1.3.5: PASS with 1.3.5
  3. xom 1.3.6: PASS with 1.3.6

However, when enabling this rule from the maven-enforcer-plugin, case 1.1 and 2 would FAIL instead.

Now @carlosame holds that htmlparser should depend on xom [1.3.5,), since it allows case 1.1 to PASS, even with the rule enabled. However, I hold that version ranges should be avoided, and that the proper solution is for root to modify its POM to match 1.2. or 3 instead.

<scope>compile</scope>
<optional>true</optional>
</dependency>
Expand Down
1 change: 1 addition & 0 deletions src/nu/validator/htmlparser/common/DoctypeExpectation.java
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
* @version $Id$
* @author hsivonen
*/
@Deprecated
public enum DoctypeExpectation {
/**
* Be a pure HTML5 parser.
Expand Down
4 changes: 3 additions & 1 deletion src/nu/validator/htmlparser/common/DocumentModeHandler.java
Original file line number Diff line number Diff line change
Expand Up @@ -39,8 +39,10 @@ public interface DocumentModeHandler {
* @param mode the document mode
* @param publicIdentifier the public id of the doctype or <code>null</code> if unavailable
* @param systemIdentifier the system id of the doctype or <code>null</code> if unavailable
* @param html4SpecificAdditionalErrorChecks <code>true</code> if HTML 4-specific checks were enabled, <code>false</code> otherwise
* @throws SAXException if things go wrong
*/
public void documentMode(DocumentMode mode, String publicIdentifier, String systemIdentifier) throws SAXException;

@Deprecated
public void documentMode(DocumentMode mode, String publicIdentifier, String systemIdentifier, boolean html4SpecificAdditionalErrorChecks) throws SAXException;
}
9 changes: 8 additions & 1 deletion src/nu/validator/htmlparser/dom/DOMTreeBuilder.java
Original file line number Diff line number Diff line change
Expand Up @@ -248,8 +248,15 @@ protected DOMTreeBuilder(DOMImplementation implementation) {
/**
*
* @see nu.validator.htmlparser.impl.TreeBuilder#documentMode(nu.validator.htmlparser.common.DocumentMode,
* java.lang.String, java.lang.String, boolean)
* java.lang.String, java.lang.String)
*/
protected void documentMode(DocumentMode mode, String publicIdentifier,
String systemIdentifier)
throws SAXException {
document.setUserData("nu.validator.document-mode", mode, null);
}

@Deprecated
protected void documentMode(DocumentMode mode, String publicIdentifier,
String systemIdentifier, boolean html4SpecificAdditionalErrorChecks)
throws SAXException {
Expand Down
4 changes: 4 additions & 0 deletions src/nu/validator/htmlparser/dom/HtmlDocumentBuilder.java
Original file line number Diff line number Diff line change
Expand Up @@ -462,6 +462,7 @@ public void setScriptingEnabled(boolean scriptingEnabled) {
*
* @return the doctypeExpectation
*/
@Deprecated
public DoctypeExpectation getDoctypeExpectation() {
return doctypeExpectation;
}
Expand All @@ -473,6 +474,7 @@ public DoctypeExpectation getDoctypeExpectation() {
* the doctypeExpectation to set
* @see nu.validator.htmlparser.impl.TreeBuilder#setDoctypeExpectation(nu.validator.htmlparser.common.DoctypeExpectation)
*/
@Deprecated
public void setDoctypeExpectation(DoctypeExpectation doctypeExpectation) {
this.doctypeExpectation = doctypeExpectation;
if (treeBuilder != null) {
Expand Down Expand Up @@ -526,6 +528,7 @@ public void setStreamabilityViolationPolicy(
* the name in the value.
* @param html4ModeCompatibleWithXhtml1Schemata
*/
@Deprecated
public void setHtml4ModeCompatibleWithXhtml1Schemata(
boolean html4ModeCompatibleWithXhtml1Schemata) {
this.html4ModeCompatibleWithXhtml1Schemata = html4ModeCompatibleWithXhtml1Schemata;
Expand All @@ -548,6 +551,7 @@ public Locator getDocumentLocator() {
*
* @return the html4ModeCompatibleWithXhtml1Schemata
*/
@Deprecated
public boolean isHtml4ModeCompatibleWithXhtml1Schemata() {
return html4ModeCompatibleWithXhtml1Schemata;
}
Expand Down
1 change: 1 addition & 0 deletions src/nu/validator/htmlparser/impl/Tokenizer.java
Original file line number Diff line number Diff line change
Expand Up @@ -748,6 +748,7 @@ public void setNamePolicy(XmlViolationPolicy namePolicy) {
* @param html4ModeCompatibleWithXhtml1Schemata
* the html4ModeCompatibleWithXhtml1Schemata to set
*/
@Deprecated
public void setHtml4ModeCompatibleWithXhtml1Schemata(
boolean html4ModeCompatibleWithXhtml1Schemata) {
this.html4ModeCompatibleWithXhtml1Schemata = html4ModeCompatibleWithXhtml1Schemata;
Expand Down
1 change: 1 addition & 0 deletions src/nu/validator/htmlparser/impl/TreeBuilder.java
Original file line number Diff line number Diff line change
Expand Up @@ -6027,6 +6027,7 @@ public void setIsSrcdocDocument(boolean isSrcdocDocument) {
* @param doctypeExpectation
* the doctypeExpectation to set
*/
@Deprecated
public void setDoctypeExpectation(DoctypeExpectation doctypeExpectation) {
this.doctypeExpectation = doctypeExpectation;
}
Expand Down
1 change: 1 addition & 0 deletions src/nu/validator/htmlparser/io/Driver.java
Original file line number Diff line number Diff line change
Expand Up @@ -558,6 +558,7 @@ public void setTransitionHandler(TransitionHandler transitionHandler) {
* @param html4ModeCompatibleWithXhtml1Schemata
* @see nu.validator.htmlparser.impl.Tokenizer#setHtml4ModeCompatibleWithXhtml1Schemata(boolean)
*/
@Deprecated
public void setHtml4ModeCompatibleWithXhtml1Schemata(
boolean html4ModeCompatibleWithXhtml1Schemata) {
tokenizer.setHtml4ModeCompatibleWithXhtml1Schemata(html4ModeCompatibleWithXhtml1Schemata);
Expand Down
4 changes: 4 additions & 0 deletions src/nu/validator/htmlparser/sax/HtmlParser.java
Original file line number Diff line number Diff line change
Expand Up @@ -817,6 +817,7 @@ public void setScriptingEnabled(boolean scriptingEnabled) {
*
* @return the doctypeExpectation
*/
@Deprecated
public DoctypeExpectation getDoctypeExpectation() {
return doctypeExpectation;
}
Expand All @@ -828,6 +829,7 @@ public DoctypeExpectation getDoctypeExpectation() {
* the doctypeExpectation to set
* @see nu.validator.htmlparser.impl.TreeBuilder#setDoctypeExpectation(nu.validator.htmlparser.common.DoctypeExpectation)
*/
@Deprecated
public void setDoctypeExpectation(DoctypeExpectation doctypeExpectation) {
this.doctypeExpectation = doctypeExpectation;
if (treeBuilder != null) {
Expand Down Expand Up @@ -881,6 +883,7 @@ public void setStreamabilityViolationPolicy(
* the name in the value.
* @param html4ModeCompatibleWithXhtml1Schemata
*/
@Deprecated
public void setHtml4ModeCompatibleWithXhtml1Schemata(
boolean html4ModeCompatibleWithXhtml1Schemata) {
this.html4ModeCompatibleWithXhtml1Schemata = html4ModeCompatibleWithXhtml1Schemata;
Expand All @@ -903,6 +906,7 @@ public Locator getDocumentLocator() {
*
* @return the html4ModeCompatibleWithXhtml1Schemata
*/
@Deprecated
public boolean isHtml4ModeCompatibleWithXhtml1Schemata() {
return html4ModeCompatibleWithXhtml1Schemata;
}
Expand Down
4 changes: 4 additions & 0 deletions src/nu/validator/htmlparser/xom/HtmlBuilder.java
Original file line number Diff line number Diff line change
Expand Up @@ -499,6 +499,7 @@ public void setScriptingEnabled(boolean scriptingEnabled) {
*
* @return the doctypeExpectation
*/
@Deprecated
public DoctypeExpectation getDoctypeExpectation() {
return doctypeExpectation;
}
Expand All @@ -510,6 +511,7 @@ public DoctypeExpectation getDoctypeExpectation() {
* the doctypeExpectation to set
* @see nu.validator.htmlparser.impl.TreeBuilder#setDoctypeExpectation(nu.validator.htmlparser.common.DoctypeExpectation)
*/
@Deprecated
public void setDoctypeExpectation(DoctypeExpectation doctypeExpectation) {
this.doctypeExpectation = doctypeExpectation;
if (treeBuilder != null) {
Expand Down Expand Up @@ -563,6 +565,7 @@ public void setStreamabilityViolationPolicy(
* the name in the value.
* @param html4ModeCompatibleWithXhtml1Schemata
*/
@Deprecated
public void setHtml4ModeCompatibleWithXhtml1Schemata(
boolean html4ModeCompatibleWithXhtml1Schemata) {
this.html4ModeCompatibleWithXhtml1Schemata = html4ModeCompatibleWithXhtml1Schemata;
Expand All @@ -585,6 +588,7 @@ public Locator getDocumentLocator() {
*
* @return the html4ModeCompatibleWithXhtml1Schemata
*/
@Deprecated
public boolean isHtml4ModeCompatibleWithXhtml1Schemata() {
return html4ModeCompatibleWithXhtml1Schemata;
}
Expand Down
12 changes: 11 additions & 1 deletion src/nu/validator/htmlparser/xom/XOMTreeBuilder.java
Original file line number Diff line number Diff line change
Expand Up @@ -254,9 +254,19 @@ protected void start(boolean fragment) throws SAXException {

/**
* @see nu.validator.htmlparser.impl.TreeBuilder#documentMode(nu.validator.htmlparser.common.DocumentMode,
* java.lang.String, java.lang.String, boolean)
* java.lang.String, java.lang.String)
*/
protected void documentMode(DocumentMode mode, String publicIdentifier,
String systemIdentifier)
throws SAXException {
if (document instanceof Mode) {
Mode modal = (Mode) document;
modal.setMode(mode);
}
}

@Override
@Deprecated
protected void documentMode(DocumentMode mode, String publicIdentifier,
String systemIdentifier, boolean html4SpecificAdditionalErrorChecks)
throws SAXException {
Expand Down