31 changes: 30 additions & 1 deletion api/envoy/config/route/v3/route.proto
@@ -5,6 +5,7 @@ package envoy.config.route.v3;
import "envoy/config/core/v3/base.proto";
import "envoy/config/core/v3/config_source.proto";
import "envoy/config/route/v3/route_components.proto";
import "envoy/type/matcher/v3/regex.proto";

import "google/protobuf/any.proto";
import "google/protobuf/wrappers.proto";
@@ -23,7 +24,7 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// * Routing :ref:`architecture overview <arch_overview_http_routing>`
// * HTTP :ref:`router filter <config_http_filters_router>`

// [#next-free-field: 18]
// [#next-free-field: 19]
message RouteConfiguration {
option (udpa.annotations.versioning).previous_message_type = "envoy.api.v2.RouteConfiguration";

@@ -155,6 +156,34 @@ message RouteConfiguration {
// For instance, if the metadata is intended for the Router filter,
// the filter name should be specified as ``envoy.filters.http.router``.
core.v3.Metadata metadata = 17;

// The host simplification rules are a set of regex substitutions
// that can modify the :authority used when matching
// VirtualHosts. It will not change what is sent upstream. This can
// be used to implement multiple-wildcard matching, by converting
// all but one of the wildcards into a static string.
// This is similar to ignore_port_in_host_matching (above), but more flexible.
// To use this, at least one simplification rule must be configured,
// and then a
// :ref:`envoy_v3_api_msg_config.route.v3.VirtualHost`.domains field
// must be set to match the results of the simplification rule.
// An example may help:
// > host_simplification_rules:
// > - pattern:
// > regex: "^(foo)[.]([^.]+)[.](example[.]org)$"
Contributor

I remember seeing some discussions about the disadvantages of using general regex matching for use cases other than prefix or suffix matching, but I cannot recall the details or whether they related to host matching.
cc @yanavlasov @mattklein123, who may have more context.

Contributor Author

I saw a proposal a while back about using domain_regex as a list in virtual hosts. This is arguably somewhat similar, but hopefully done in a way that lets the host-matching algorithm itself change to use alternate structures (a trie comes to mind, for example), which a list of regexes in the virtual hosts would not allow.

// > substitution: \1.bar.\3
// will allow an HTTP request with an :authority header of
// 'foo.anything.example.org' or 'foo.something.example.org' to both
// be matched by a VirtualHost with a domain entry of
// 'foo.bar.example.org', due to the second label in the domain
// being replaced by the simplification rule to 'bar'.
// If multiple rules are provided, they are processed in order. The
// results of the first rule will be used by the second rule, and so
// on. It is unlikely that you want to depend on this behavior,
// however, due to the potential for confusion. It is recommended
// that, if you need multiple simplification rules, they should be
// as independent of each other as possible.
repeated type.matcher.v3.RegexMatchAndSubstitute host_simplification_rules = 18;
Contributor

This could benefit from an example, I think. Regexes are pretty hard to understand (for most people), and we need to give some guidance.

Yeah, that's fair. I wrote more words, though I'm unsure how the formatting will work out.

Contributor Author

I also slightly improved the example from the one I used in the tests by putting the `.` outside the capture groups, so it is more obvious what the substitution itself is doing.
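
A rough way to try the example rule outside Envoy is a small `std::regex` sketch. This is an illustration under assumptions, not the PR's code: Envoy compiles the pattern through its configured regex engine (`Regex::Utility::parseRegex` in this PR), while the sketch uses `std::regex`, where backreferences are written `$1` rather than `\1`. The name `simplifyHost` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not Envoy code): applies the single example rule
// from the proto comment using std::regex instead of Envoy's regex engine.
std::string simplifyHost(const std::string& host) {
  // Same pattern as the proto example; the dots sit outside the capture groups.
  static const std::regex pattern(R"(^(foo)[.]([^.]+)[.](example[.]org)$)");
  // std::regex spells backreferences $1 and $3 where the proto example uses \1 and \3.
  return std::regex_replace(host, pattern, "$1.bar.$3");
}
```

With this sketch, `simplifyHost("foo.anything.example.org")` yields `foo.bar.example.org`, while a host such as `qux.foo.baz.example.org` comes back unchanged because the anchored pattern does not match it.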

Member

What will happen if there are multiple rules? Will only the final result be used to find the vhost, or will every rule's result be used one by one?

And at least give it a meaningful name, like host_rewrite_for_matching or something.

Contributor Author

The final result; I think the updated comments explain that better now, at least.
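
To make the "final result" semantics concrete, here is a minimal sketch of chained rules, assuming `std::regex` in place of Envoy's compiled matcher. `Rule`, `applyRules`, and `exampleRules` are hypothetical names, and the two rules are invented purely to show that the second rule sees the output of the first.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for the PR's SimplificationRule, built on std::regex.
struct Rule {
  std::regex pattern;
  std::string substitution;  // std::regex syntax: $1, $2, ...
};

// Applies every rule in order. Each rule operates on the output of the
// previous one, and only the final string is used for the vhost lookup.
std::string applyRules(const std::vector<Rule>& rules, std::string host) {
  for (const Rule& rule : rules) {
    host = std::regex_replace(host, rule.pattern, rule.substitution);
  }
  return host;
}

// Two made-up rules: the first drops a "staging" label, the second inserts
// "bar". For "foo.staging.example.org" the second rule only matches because
// the first one has already run.
std::vector<Rule> exampleRules() {
  return {
      {std::regex(R"(^([^.]+)[.]staging[.](example[.]org)$)"), "$1.$2"},
      {std::regex(R"(^(foo)[.](example[.]org)$)"), "$1.bar.$2"},
  };
}
```

This kind of coupling between rules is what the proto comment warns about: reordering the two rules changes the result for `foo.staging.example.org`, which is why keeping rules independent is the safer choice.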

Contributor Author

How about simplification_rules_for_host_matching?
I'm wary of host_rewrite... because it intentionally doesn't do the rewriting for the upstream request, though I suppose these rules don't have to be used for "simplification" either.

}

message Vhds {
6 changes: 6 additions & 0 deletions changelogs/current.yaml
@@ -359,5 +359,11 @@ new_features:
Added a new metric ``db_build_epoch`` to track the build timestamp of the MaxMind geolocation database files.
This can be used to monitor the freshness of the databases currently in use by the filter.
See `MaxMind DB build_epoch <https://maxmind.github.io/MaxMind-DB/#build_epoch>`_ for more details.
- area: http
change: |
Added support for :ref:`host_simplification_rules
<envoy_v3_api_field_config.route.v3.RouteConfiguration.host_simplification_rules>`
to allow for regular expression substitutions to "simplify" a host
before doing virtual host matching.

deprecated:
33 changes: 27 additions & 6 deletions source/common/router/config_impl.cc
@@ -1929,6 +1929,17 @@ RouteMatcher::RouteMatcher(const envoy::config::route::v3::RouteConfiguration& r
}
}
}
for (const auto& simplification_rule : route_config.host_simplification_rules()) {
auto result =
Regex::Utility::parseRegex(simplification_rule.pattern(), factory_context.regexEngine());

SET_AND_RETURN_IF_NOT_OK(result.status(), creation_status);

std::unique_ptr<SimplificationRule> rule = std::make_unique<SimplificationRule>(
std::move(*result), simplification_rule.substitution());

host_simplification_rules_.push_back(std::move(rule));
}
}

const VirtualHostImpl* RouteMatcher::findVirtualHost(const Http::RequestHeaderMap& headers) const {
@@ -1943,34 +1954,44 @@ const VirtualHostImpl* RouteMatcher::findVirtualHost(const Http::RequestHeaderMa
return nullptr;
}

// Lower-case the value of the host header, as hostnames are case insensitive.
std::string host_header_value = absl::AsciiStrToLower(headers.getHostValue());

// If 'ignore_port_in_host_matching' is set, ignore the port number in the host header(if any).
absl::string_view host_header_value = headers.getHostValue();
if (ignorePortInHostMatching()) {
if (const absl::string_view::size_type port_start =
Http::HeaderUtility::getPortStart(host_header_value);
port_start != absl::string_view::npos) {
host_header_value = host_header_value.substr(0, port_start);
}
}

// If any host simplification rules exist, process them in order to
// rewrite the host header used when looking up virtual hosts. (This
// is notionally similar to the handling of
// `ignore_port_in_host_matching`, but more flexible.)
for (const auto& simplifier : host_simplification_rules_) {
host_header_value =
simplifier->matcher->replaceAll(host_header_value, simplifier->substitution);
}

// TODO (@rshriram) Match Origin header in WebSocket
// request with VHost, using wildcard match
// Lower-case the value of the host header, as hostnames are case insensitive.
const std::string host = absl::AsciiStrToLower(host_header_value);
const auto iter = virtual_hosts_.find(host);
const auto iter = virtual_hosts_.find(host_header_value);
if (iter != virtual_hosts_.end()) {
return iter->second.get();
}
if (!wildcard_virtual_host_suffixes_.empty()) {
const VirtualHostImpl* vhost = findWildcardVirtualHost(
host, wildcard_virtual_host_suffixes_,
host_header_value, wildcard_virtual_host_suffixes_,
[](absl::string_view h, int l) -> absl::string_view { return h.substr(h.size() - l); });
if (vhost != nullptr) {
return vhost;
}
}
if (!wildcard_virtual_host_prefixes_.empty()) {
const VirtualHostImpl* vhost = findWildcardVirtualHost(
host, wildcard_virtual_host_prefixes_,
host_header_value, wildcard_virtual_host_prefixes_,
[](absl::string_view h, int l) -> absl::string_view { return h.substr(0, l); });
if (vhost != nullptr) {
return vhost;
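
As I read the hunk above, the host value is now lower-cased first, then optionally stripped of its port, then passed through the simplification rules, and only the result is used for the exact and wildcard lookups. A minimal sketch of that ordering follows, with loud assumptions: it uses `std::regex` rather than Envoy's matcher, `stripPort` is a simplified hypothetical replacement for `Http::HeaderUtility::getPortStart`, and `Rule` and `normalizeHost` are names invented for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for the PR's SimplificationRule.
struct Rule {
  std::regex pattern;
  std::string substitution;  // std::regex syntax: $1, $2, ...
};

// Simplified, hypothetical port stripping. It does not handle bare
// (unbracketed) IPv6 addresses.
std::string stripPort(const std::string& host) {
  const auto colon = host.rfind(':');
  if (colon == std::string::npos) {
    return host;
  }
  // A ']' after the last colon means the colon belongs to an IPv6 literal
  // such as "[::1]", so there is no port to remove.
  if (host.find(']', colon) != std::string::npos) {
    return host;
  }
  return host.substr(0, colon);
}

// Mirrors the order in the diff: lower-case, strip port, apply rules.
std::string normalizeHost(std::string host, bool ignore_port,
                          const std::vector<Rule>& rules) {
  for (char& c : host) {
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  if (ignore_port) {
    host = stripPort(host);
  }
  for (const Rule& rule : rules) {
    host = std::regex_replace(host, rule.pattern, rule.substitution);
  }
  return host;
}
```

One consequence of this ordering is that patterns written in lower case also match mixed-case `:authority` values, and an anchored pattern ending in `$` will not match a host that still carries its port unless port stripping is enabled.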
9 changes: 9 additions & 0 deletions source/common/router/config_impl.h
@@ -9,6 +9,7 @@
#include <string>
#include <vector>

#include "envoy/config/common/matcher/v3/matcher.pb.h"
#include "envoy/config/core/v3/base.pb.h"
#include "envoy/config/route/v3/route.pb.h"
#include "envoy/config/route/v3/route_components.pb.h"
@@ -1255,6 +1256,12 @@ class RouteListMatchActionFactory : public Matcher::ActionFactory<RouteActionCon

DECLARE_FACTORY(RouteListMatchActionFactory);

// Helper structure to keep the matcher and substitution together.
struct SimplificationRule {
const Regex::CompiledMatcherPtr matcher;
const std::string substitution;
};

/**
* Wraps the route configuration which matches an incoming request headers to a backend cluster.
* This is split out mainly to help with unit testing.
@@ -1303,6 +1310,8 @@ class RouteMatcher {

VirtualHostImplSharedPtr default_virtual_host_;
const bool ignore_port_in_host_matching_{false};

std::vector<std::unique_ptr<SimplificationRule>> host_simplification_rules_;
};

/**
62 changes: 62 additions & 0 deletions test/common/router/config_impl_test.cc
@@ -2495,6 +2495,68 @@ TEST_F(RouteMatcherTest, IgnorePortInHostMatching) {
}
}

// Tests that host_simplification_rules mutate/simplify the host used
// for picking the virtualhost when matching
TEST_F(RouteMatcherTest, HostSimplificationRules) {

const std::string yaml = R"EOF(
host_simplification_rules:
- pattern:
regex: "^(foo)[.]([^.]+)[.](example[.]org)$"
substitution: "\\1.bar.\\3"
virtual_hosts:
- name: local_service
domains: ["foo.bar.example.org"]
routes:
- match:
prefix: ""
name: "business-specific-route"
route:
cluster: local_service_grpc
- name: catchall_host
domains:
- "*"
routes:
- match:
prefix: ""
name: "default-route"
route:
cluster: default_catch_all_service
)EOF";
auto route_configuration = parseRouteConfigurationFromYaml(yaml);

factory_context_.cluster_manager_.initializeClusters(
{"local_service_grpc", "default_catch_all_service"}, {});
{
TestConfigImpl config(route_configuration, factory_context_, true, creation_status_);

// First, the trivial, no substitution needed, but should happen anyway:
EXPECT_EQ(config.route(genHeaders("foo.bar.example.org", "/foo", "GET"), 0)->routeName(),
"business-specific-route");

// Matches, but requires the substitution to happen:
EXPECT_EQ(config.route(genHeaders("foo.baz.example.org", "/foo", "GET"), 0)->routeName(),
"business-specific-route");

// Matches, require substitution, longer replaceable section
EXPECT_EQ(
config.route(genHeaders("foo.barbazquxfoobang.example.org", "/foo", "GET"), 0)->routeName(),
"business-specific-route");

// Shouldn't match, but has a related substring:
EXPECT_EQ(config.route(genHeaders("qux.foo.baz.example.org", "/foo", "GET"), 0)->routeName(),
"default-route");

// Shouldn't match (trivial)
EXPECT_EQ(config.route(genHeaders("12.34.56.78:1234", "/foo", "GET"), 0)->routeName(),
"default-route");
EXPECT_EQ(config.route(genHeaders("www.foo.com:8090", "/foo", "GET"), 0)->routeName(),
"default-route");
EXPECT_EQ(config.route(genHeaders("[12:34:56:7890::]:8090", "/foo", "GET"), 0)->routeName(),
"default-route");
}
}

TEST_F(RouteMatcherTest, Priority) {
const std::string yaml = R"EOF(
virtual_hosts: