Analysis of Potential Solutions for
Revealing a Host Identifier (HOST_ID) in Shared Address DeploymentsFrance TelecomRennes35000Francemohamed.boucadair@orange.comUSC/ISI4676 Admiralty WayMarina del ReyCA90292-6695United States+1 (310) 448-9151touch@isi.eduFrance TelecomCaen14000Francepierre.levis@orange.comCiscoUnited Statesrepenno@cisco.comINTAREA WGNAT, Host IdentifierThis document is a collection of potential solutions for revealing a host identifier
(denoted as HOST_ID) when a Carrier Grade NAT (CGN) or application proxies are
involved in the path. This host identifier could be used by a remote server to
sort packets according to the sending host. The
host identifier must be unique to each host under the same shared IP
address.This document analyzes a set of potential solutions for revealing a
host identifier and does not recommend a particular solution, although it does
highlight the hazards of some approaches.As reported in , several issues are
encountered when an IP address is shared among several subscribers.
These issues are encountered in various deployment contexts, e.g.,
Carrier-Grade NAT (CGN), application proxies, or Address plus Port (A+P)
. Examples of such issues are: implicit
identification (Section 13.2 of ), spam
(Section 13.3 of ), blacklisting a
misbehaving host (Section 13.1 of ), or
redirecting users with infected machines to a dedicated portal (Section 5.1
of ).In particular, some servers use the source IPv4 address as an
identifier to treat some incoming connections differently. Due to the
deployment of CGNs (e.g., NAT44 , NAT64
), that address will be shared. In
particular, when a server receives packets from the same source address,
because this address is shared, the server does not know which host is
the sending host . The sole use of the
IPv4 address is not sufficient to uniquely distinguish a host.
As a mitigation, it is tempting to investigate ways that would disclose
information to be used by the remote server as a means of uniquely
disambiguating packets sent from hosts using the same IPv4 address.
The risk of not mitigating these issues include: OPEX (Operational
Expenditure) increase for IP connectivity service providers (costs
induced by calls to a hotline), revenue loss for content providers (loss
of users' audience), and customers' dissatisfaction (low quality of
experience, service segregation, etc.).The purpose of this document is to analyze a set of alternative
channels to convey a host identifier and to assess to what extent the
alternatives solve the problem described in . The
evaluation is intended to be comprehensive, regardless of the maturity or
validity of any currently known or proposed solution. The alternatives
analyzed in the document are listed below:Use the Identification field of the IP header (denoted as IP-ID,
).Define a new IP option ().Define a new TCP option ().Inject application headers ().Enable Proxy Protocol ().Assign port sets ().Activate HIP (Host Identity Protocol) ().Use a notification channel ().Use an out-of-band mechanism ().A synthesis is provided in , while the
detailed analysis is elaborated in . discusses privacy issues common to all
proposed solutions. It is out of scope of this document to elaborate on
privacy issues specific to each solution.This document does not include any recommendations because the working
group felt that it was too premature to include one.Policies that rely on source IP addresses and that are enforced by some
servers will be applied to all hosts sharing the same IP address. For
example, blacklisting the IP address of a spammer host will result in
all other hosts that share that address having their access to the
requested service restricted. describes
the issues in detail. Therefore, due to address sharing, servers need
extra information beyond the source IP address to differentiate the
sending host. We call this information the HOST_ID.The HOST_ID identifies a host under a shared IP address. Privacy-related
considerations are discussed in .Within this document, a host can be any computer located behind a
Home Gateway or directly connected to an address-sharing function
located in the network provider's domain (typically this would be the
Home Gateway itself).Because the HOST_ID is used by a remote server to sort out the packets by
sending host, the HOST_ID must be unique to each host under the same shared
IP address, where possible. In the case where only the Home Gateway is
revealed to the operator side of the translation function, the HOST_ID need
only be unique to the Home Gateway. The HOST_ID does not need to be globally
unique. Of course, the combination of the (public) IP source address and
the identifier (i.e., HOST_ID) ends up being unique.If the HOST_ID is conveyed at the IP level, all packets will have to
bear the identifier. If it is conveyed at a higher connection-oriented
level, the identifier is only needed once in the session establishment
phase (for instance, a TCP three-way handshake), then all packets received
in this session will be attributed to the HOST_ID designated during the
session opening.Within this document, we assume the operator-side address-sharing
function injects the HOST_ID. Another deployment option to avoid
potential performance degradation is to let the host or Home Gateway
inject its HOST_ID, but the address-sharing function will check its
content (just like an IP anti-spoofing function). For some proposals,
the HOST_ID is retrieved using an out-of-band mechanism or signaled in a
dedicated notification channel.For A+P and its variants, port set
announcements may be needed as discussed in .Security considerations are common to all analyzed solutions (see
). Privacy-related aspects are discussed
in .The HOST_ID can be ambiguous for hosts with multiple interfaces or
multiple addresses assigned to a single interface. HOST_IDs that are the
same may be used to imply or infer the same end system, but HOST_IDs
that are different should not be used to imply or infer whether the end
systems are the same or different.IP address sharing is motivated by a number of different factors. For
years, many network operators have conserved public IPv4
addresses by making use of Customer Premises Equipment (CPE) that
assigns a single public IPv4 address to all hosts within the customer's
local area network and uses NAT to
translate between locally unique private IPv4 addresses and the CPE's
public address. With the exhaustion of IPv4 address space, address
sharing between customers on a much larger scale is likely to become
much more prevalent. While many individual users are unaware of and
uninvolved in decisions about whether their unique IPv4 addresses get
revealed when they send data via IP, some users realize privacy benefits
associated with IP address sharing, and some may even take steps to
ensure that NAT functionality sits between them and the public Internet.
IP address sharing makes the actions of all users behind the NAT
function unattributable to any single host, creating room for abuse but
also providing some identity protection for non&nbhy;abusive users who wish
to transmit data with reduced risk of being uniquely identified.The proposals considered in this document help
differentiate between hosts that share a public IP address. The extent
of that differentiation depends on what information is included in the
HOST_ID.
The volatility of the HOST_ID information is similar to that of the
internal IP address: a distinct HOST_ID may be used by the
address-sharing function when the host reboots or gets a new internal IP
address. As with persistent IP addresses, persistent HOST_IDs facilitate
user tracking over time.As a general matter, the HOST_ID proposals do not seek to make hosts
any more identifiable than they would be if they were using a public,
non-shared IP address. However, depending on the solution proposal, the
addition of HOST_ID information may allow a device to be fingerprinted
more easily than it otherwise would be. To prevent this, the following
design considerations are to be taken into account:It is
recommended that HOST_IDs be limited to providing local uniqueness
rather than global uniqueness.The address-sharing function
should not use permanent HOST_ID values.Should multiple solutions be combined (e.g., TCP option and Forwarded
header) that include different pieces of information in the HOST_ID,
fingerprinting may become even easier. To prevent this, an
address-sharing function that is able to inject HOST_IDs in several layers
should reveal the same subsets of information at each layer. For
example, if one layer references the lower 16 bits of an IPv4 address, the
other layer should reference these 16 bits too.A HOST_ID can be spoofed, as this is also the case for spoofing an IP
address. Furthermore, users of network-based anonymity services (like
Tor [TOR]) may be capable of stripping HOST_ID information before it reaches
its destination.
In order to control the information revealed to external parties, an
address-sharing function should be able to strip, rewrite, and add
HOST_ID fields.An address-sharing function may be configured to enforce different
end-user preferences with regards to HOST_ID injection. For example,
HOST_ID injection can be disabled for some users. This feature is
policy based and deployment specific.HOST_ID specification document(s) should explain the privacy impact
of the solutions they specify, including the extent of HOST_ID
uniqueness and persistence, assumptions made about the lifetime of the
HOST_ID, whether and how the HOST_ID can be obfuscated or recycled,
whether location information can be exposed, and the impact of the use
of the HOST_ID on device or implementation fingerprinting. provides further
guidance.For more discussion about privacy, refer to .The IPv4 ID (Identification field of IP header, i.e., IP-ID) can
be used to insert information that uniquely distinguishes a host
among those sharing the same IPv4 address. Use of the IP-ID as a
channel to convey the HOST_ID is a theoretical construct (i.e., it is an
undocumented proposal).An address-sharing function can rewrite the IP-ID field to
insert a value that is unique to the host (16 bits are sufficient to
uniquely disambiguate hosts sharing the same IP address). The
address-sharing function injecting the HOST_ID must follow the rules
defined in ; in particular, the same
HOST_ID is not reassigned to another host sharing the same IP
address during a given time interval.A variant of this approach relies upon the format of certain
packets, such as TCP SYN, where the IP-ID can be modified to contain
a 16-bit HOST_ID.Address-sharing devices using this solution would be required to
indicate that they do so, possibly using a special DNS record.This usage is not consistent with the fragment reassembly use of
the Identification field or the
updated handling rules for the Identification field .Complications may arise if the packet is fragmented before
reaching the device that is injecting the HOST_ID. To appropriately handle
those packet fragments, the address-sharing function will need to
maintain a lot of state.Another complication to be encountered is where translation is
balanced among several NATs; setting the appropriate HOST_ID by a
given NAT would alter the coordination between those NATs. Of
course, one can argue that this coordinated NAT scenario is not a typical
deployment scenario; regardless, using the IP-ID as a channel to convey
a HOST_ID is ill-advised.An alternate way to convey the HOST_ID is to define an IP
option . A HOST_ID IP option can be
inserted by the address-sharing function to uniquely distinguish a
host among those sharing the same IP address. An example of such an
option is documented in . This IP
option allows the conveyance of an IPv4 address, an IPv6 prefix, a Generic Routing Encapsulation (GRE) key, an IPv6 Flow Label,
etc.An IP option may also be used as described in Section
4.6 of .This proposal can apply to any transport protocol. However,
it is widely known that routers and other middleboxes filter IP
options (e.g., drop IP packets with unknown IP options, strip
unknown IP options, etc.).Injecting the HOST_ID IP option introduces some implementation
complexity in the following cases: The packet is at or close to the MTU size.The options space is exhausted.Previous studies demonstrated that "IP Options are not an option"
(refer to and ).In conclusion, using an IP option to convey a HOST_ID is not
viable.The HOST_ID may be conveyed in a dedicated TCP option. An example is
specified in . This
option encloses the TCP client's identifier (e.g., the lower 16 bits
of its IPv4 address, its VLAN ID, VRF ID, or subscriber ID). The
address-sharing device inserts this TCP option into the TCP SYN
packet.Using a new TCP option to convey the HOST_ID does not require any
modification to the applications, but it is applicable only for
TCP&nbhy;based applications. Applications relying on other transport
protocols are therefore left unsolved. discusses the
interference with other TCP options.The risk of session failure due to handling a new TCP
option is low as measured in . provides a
detailed implementation and experimentation report of a HOST_ID TCP
option. This document provides an in-depth investigation of the
impact of implementing
HOST_ID on the host, the address-sharing function, and the
enforcement of policies at the server side. It also reports a
failure ratio of 0.103% among the top 100,000 websites.Some downsides have been identified with defining a TCP option to
reveal a host identity:Conveying an IP address in a TCP option may be seen as a
violation of OSI layers, but since IP addresses are already used
for the checksum computation, this is not seen as a blocking
point. Moreover, the updated version of no longer allows
conveyance of a full IP address because the HOST_ID is encoded in 16
bits.TCP option space is limited and might be consumed by the TCP
client. discusses
two approaches to sending the HOST_ID: sending the HOST_ID in
the TCP SYN (which consumes more bytes in the TCP header of the
TCP SYN) and sending the HOST_ID in a TCP ACK (which consumes
only two bytes in the TCP SYN).Content providers may find it more desirable to receive the
HOST_ID in the TCP SYN, as that more closely preserves the
HOST_ID received in the source IP address as per current
practices. Moreover, sending the HOST_ID in the TCP SYN does not
interfere with . In
the ACK mode, if the server is configured to deliver different
data based on HOST_ID, then it would have to wait for the ACK
before transmitting data.HOST_ID mechanisms need to be aware of end-to-end (E2E)
issues and avoid interfering with them. One example of such
interference would be injecting or removing TCP options of
transited packets; another such interference involves
terminating and re-originating TCP connections not belonging to
the transit device. The HOST_ID TCP option handled by the source
node avoids this issue.Injecting the HOST_ID TCP option introduces some
implementation complexity if the options space is exhausted.
Specification document(s) should specify the behavior
of the address-sharing function in detail in such a case.It is more complicated to implement sending the HOST_ID in a
TCP ACK, as it can introduce MTU issues if the ACK packet also
contains TCP data or if a TCP segment is lost. Note that MTU
complications can be experienced if user data is included
in a SYN packet (e.g., ).When there are several NATs in the path, the original HOST_ID
may be lost. The loss of the original HOST_ID may not be a
problem, as the target usage is between proxies or between a CGN and
server. Only the information leaked in the last communication
leg (i.e., between the last address-sharing function and the
server) is likely to be useful.Interference with usages such as a Forwarded HTTP header (see
) should be elaborated to specify the
behavior of servers when both options are used; in particular,
specify which information to use: the content of the TCP option
or what is conveyed in the application headers.When load balancers or proxies are in the path, this option
does not allow the preservation of the original source IP
address and source port. Preserving such information is required
for logging purposes (e.g., ). defines a
TCP option that allows various combinations of source
information (e.g., source port, source port and source IP
address, source IPv6 prefix, etc.) to be revealed.More discussion about issues raised when extending TCP can
be found at .Another option is to not require any change within the transport
or the IP levels but to convey the required information that will be
used to disambiguate hosts at the application payload. The
format of the conveyed information and the related semantics depend
on its application (e.g., HTTP, SIP, SMTP, etc.).Related mechanisms could be developed for other application-layer
protocols, but the discussion in this document is limited to HTTP
and similar protocols.For HTTP, the Forwarded header can be used to
display the original IP address when an address-sharing device is
involved. Service providers operating address-sharing devices can
enable the feature of injecting the Forwarded header, which will
enclose the original IPv4 address or the IPv6 prefix part (see the
example shown in ). The address-sharing
device has to strip all included Forwarded headers before injecting
its own. Servers may rely on the contents of this field to enforce
some policies such as blacklisting misbehaving users.Note that standardizes the Forwarded
header field, to replace the de facto (and not standard) X-Forwarded-For (XFF)
header.Not all applications impacted by address sharing can support the
ability to disclose the original IP address. Only a subset of
protocols (e.g., HTTP) can rely on this solution.For the HTTP case, to prevent users from injecting invalid HOST_IDs,
an initiative has been launched by Wikimedia to maintain a list of
trusted ISPs (Internet Service Providers) using XFF (see the list
available at ). If an
address-sharing device is on the list of trusted XFF ISPs, users
editing Wikimedia located behind the address-sharing device will
appear to be editing from their "original" IP address and not from
the NATed IP address. If an offending activity is detected,
individual hosts can be blacklisted instead of blacklisting all hosts
sharing the same IP address.XFF header injection is a common practice of load balancers. When
a load balancer is in the path, the original content of any included
XFF header should not be stripped. Otherwise, the information about
the "origin" IP address will be lost.When several address-sharing devices are crossed, the Forwarded
header can convey the list of IP addresses (e.g., ). The origin HOST_ID can be exposed to the
target server.Injecting the Forwarded header also introduces some implementation
complexity if the HTTP message is at or close to the MTU size.It has been reported that some HTTP proxy implementations may
encounter parsing issues when injecting an XFF header.Injecting the Forwarded header for all HTTPS traffic is infeasible.
This may be problematic given the current HTTPS usage trends.The solution, referred to as the Proxy Protocol , does not require any application-specific
knowledge. The proposed solution (Proxy Protocol Version 1) would
insert identification data directly into the application-data stream prior to
the actual protocol data being sent, regardless of the protocol. Every
application protocol would begin with a textual string of "PROXY", followed by some textual
identification data, and with a CRLF; only then would the
application data be inserted.
shows an example of a line of data used for this purpose, in this
case, for a TCP-over-IPv4 connection received from 192.0.2.1:56324 and destined
to 192.0.2.15:443.Upon receipt of a message conveying this line, the server removes
the line from the incoming stream. The line is parsed to retrieve the
transported protocol. The content of this line is recorded in logs and used to
enforce policies.Proxy Protocol Version 2 is designed to accommodate IPv4/IPv6 and
also non-TCP protocols (see for more
details).This solution can be deployed in a controlled environment, but it
cannot be deployed to all access services available in the
Internet. If the remote server does not support the Proxy Protocol,
the session will fail. Other complications will arise due to the
presence of firewalls, for instance.As a consequence, this solution is infeasible and cannot be
recommended.This solution does not require any action from the
address-sharing function to disclose a host identifier. Instead of
assuming that all transport ports are associated with one single host,
each host under the same external IP address is assigned a
restricted port set. These port sets are then advertised to remote
servers using offline means. This announcement is not required for
the delivery of internal services (i.e., offered by the service
provider deploying the address-sharing function) relying on implicit
identification.Port sets assigned to hosts may be static or dynamic.Port set announcements to remote servers are not required to
reveal the identity of individual hosts; they are used to advertise
the enforced policy to generate non-overlapping port sets (e.g., the
transport space associated with an IP address is fragmented to
contiguous blocks of 2048 port numbers).Examples of such an approach are documented in and .The solution does not require defining new fields or options; it
is policy based.The solution may contradict the port randomization as identified in . A mitigation would be to avoid assigning
static port sets to individual hosts.The method is convenient for the delivery of services offered by
the service provider that is also offering the Internet access
service. specifies an architecture that
introduces a new namespace to convey identity information.This solution requires both the client and the server to support
HIP . Additional architectural
considerations are to be taken into account, such as the key
exchanges.An alternative deployment model that does not require the
client to be HIP-enabled is having the address-sharing function
behave as a UDP/TCP-HIP relay. This model is also not viable as it
assumes all servers are HIP-enabled.This solution is a theoretical construct (i.e., the proposal is
not documented).Another alternative is to convey the HOST_ID using a separate
notification channel than the one the packets issued to invoke the
service.This solution
relies on a mechanism where the address-sharing function
encapsulates the necessary host-identifying information into an ICMP
Echo Request packet that it sends in parallel with the initial
session creation (e.g., SYN). The information included in the ICMP
Request Data portion describes the five-tuples as seen on both
sides of the address-sharing function. An implementation example is
defined in . This ICMP proposal is valid for any transport protocol that
uses a port number. The address-sharing function may be
configured with the transport protocols that will trigger
issuing those ICMP messages.A hint should be provided to the ultimate server (or
intermediate nodes) that the ICMP Echo Request conveys a
HOST_ID. This may be implemented using magic numbers (a magic
number is a self-selected codepoint whose primary value is its unlikely
collision with values selected by others).
Even if ICMP packets are blocked in the communication path,
the user connection does not have to be impacted.Implementations requiring a session establishment to be delayed until receipt of the companion ICMP Echo Request may
lead to some user-experience degradation.Because of the presence of load balancers in the path, the
ultimate server receiving the SYN packet may not be the one
that receives the ICMP message conveying the HOST_ID.Because of the presence of load balancers in the path, the
port number assigned by address sharing may be lost. Therefore,
the mapping information conveyed in the ICMP may not be
sufficient to associate a SYN packet with a received ICMP.The proposal is not compatible with the presence of cascaded
NAT. The main reason is that each NAT in the path will generate an
ICMP message to reveal the internal host identifier. Because
these messages will be translated by the downstream
address-sharing devices, the remote server will receive multiple
ICMP messages and will need to decide which host identifier to
use.The ICMP proposal will add traffic overhead for both the
server and the address-sharing device.The ICMP proposal is similar to other mechanisms (e.g.,
IPFIX and
Syslog )
for reporting dynamic mappings to a mediation platform (mainly
for legal traceability purposes). Performance degradation is
likely to be experienced by address-sharing functions because
ICMP messages are sent for each new instantiated mapping (even if
the mapping exists).In some scenarios (e.g., Section 3 of ), the HOST_ID should
be interpreted by intermediate devices that embed Policy
Enforcement Points (PEP)
responsible for granting access to some services. These PEPs
need to inspect all received packets in order to find the
companion (traffic) messages to be correlated with ICMP messages
conveying HOST_IDs. This induces more complexity to these
intermediate devices.Another alternative is to retrieve the HOST_ID using a dedicated
query channel.An implementation example may rely on the Identification Protocol
(Ident) . This solution assumes that the
address-sharing function implements the server part of IDENT, while
remote servers implement the client part of the protocol. IDENT
needs to be updated
to be
able to return a host identifier instead of the user-id as defined
in . The IDENT response syntax uses
the same USERID field described in ,
but rather than returning a username, a host identifier (e.g., a
16-bit value) is returned.
For any
new incoming connection, the server contacts the IDENT server to
retrieve the associated identifier. During that phase, the
connection may be delayed.IDENT is specific to TCP. Alternative out-of-band mechanisms
may be designed to cover other transport protocols such as
UDP.This solution requires the address-sharing function to embed
an IDENT server.A hint should be provided to the ultimate server (or
intermediate nodes) that the address-sharing function implements
the IDENT protocol, for example, publishing this
capability using DNS (other solutions can be envisaged).An out-of-band mechanism may require some administrative
setup (e.g., contract agreement) between the entity managing the
address-sharing function and the entity managing the remote
server. Such a deployment is not feasible in the Internet at
large because establishing and maintaining agreements between
ISPs and all service actors is burdensome and not scalable.Implementations requiring delay of the establishment of a
session until receipt of the companion IDENT response may lead
to some user-experience degradation.The IDENT proposal will add traffic overhead for both the
server and the address-sharing device.Performance degradation is likely to be experienced by
address-sharing functions embedding the IDENT server. This is
further exacerbated if the address-sharing function has to
handle an IDENT query for each new instantiated mapping (even if
the mapping exists).In some scenarios (e.g., Section 3 of ), the HOST_ID should
be interpreted by intermediate devices that embed Policy
Enforcement Points (PEP)
responsible for granting access to some services. These PEPs
need to inspect all received packets in order to generate the
companion IDENT queries. This may induce more complexity to
these intermediate devices.IDENT queries may be generated by illegitimate TCP servers.
This would require the address-sharing function to enforce some
policies (e.g., rate-limit queries, filter based on the source
IP address, etc.).Table 1 summarizes the approaches analyzed in this
document."Encrypted Traffic" refers to TLS. The use of IPsec and its
complications traversing NATs are discussed in Section 2.2 of . Similar to what is
suggested in Section 13.5 of , HOST_ID
specification document(s) should analyze the compatibility
of each IPsec mode in detail."Success ratio" indicates the ratio of successful communications
with remote servers when the HOST_ID is injected using a proposed
solution. More details are provided below to explain how the success
ratio is computed for each proposed solution."Possible Perf Impact" indicates the level of expected
performance degradation. The indicated degradation is an estimate
based on the need for processing at the IP layer.
"OS TCP/IP Modif" indicates whether a modification of the OS
TCP/IP stack is required at the server side."Deployable today" indicates if the solution can be generalized
without any constraint on current architectures and practices.Provided success ratio figures for TCP and IP options are based on
the results documented in and .The provided success ratio for the IP-ID is theoretical; it assumes the
address-sharing function follows the rules (see )to rewrite the IP Identification field.Since PROXY and HIP are not widely deployed, the success ratio for
establishing communication with remote servers using these protocols
is low.The success ratio for the ICMP-based solution is
implementation-specific but it is likely to be close to 100%. The
success ratio depends on how efficiently the solution is implemented on
the server side. A remote server that does not support the ICMP-based
solution will ignore received companion ICMP messages. An upgraded server
will need to delay the acceptance of a session until it receives the companion
ICMP message.The success ratio for the IDENT solution is implementation specific but
it is likely to be close to 100%. The success ratio depends on how
efficient the solution is implemented on the server side. A remote
server that does not support IDENT will accept a session establishment
request following its normal operation. An upgraded server will need to
delay the acceptance of a session until it receives a response to the IDENT
request it will send to the host.
The provided success ratio for the Port Set and HTTP header solutions is
100% because no additional Layer 3 or Layer 4 field is needed to convey HOST_ID
for these solutions.
If the server trusts the content of the HOST_ID field, a third-party user
can be impacted by a misbehaving user revealing a "faked" HOST_ID (e.g.,
original IP address). This same security concern applies for the injection
of an IP option, TCP option, and application-related content (e.g.,
the Forwarded HTTP header) by the address-sharing device.
The HOST_ID may be used to leak information about the internal structure
of a network behind an address-sharing function. If this behavior is
undesired for the network administrator, the address-sharing function
can be configured to strip any existing HOST_ID in received packets from
internal hosts.HOST_ID specification documents should elaborate further on threats
inherent to each individual solution used to convey the HOST_ID (e.g.,
use of the IP-ID field to count hosts behind a NAT ).For more discussion of privacy issues related to the HOST_ID, see .Many thanks to D. Wing, C. Jacquenet, J. Halpern, B. Haberman, and P. Yee for their review, comments, and inputs.Thanks also to P McCann, T. Tsou, Z. Dong, B. Briscoe, T. Taylor, M. Blanchet, D. Wing, and A. Yourtchenko for the discussions in Prague.Some of the issues related to defining a new TCP option have been
raised by L. Eggert.The privacy text was provided by A. Cooper.Tor: The secondgeneration onion routerRevealing Hosts
Sharing An IP Address Using TCP OptionWhen
an IP address is shared among several subscribers -- with a NAT or with an
application-level proxy -- it is impossible for the server to differentiate
between different clients. Such differentiation is valuable in several
scenarios. This memo describes a technique to differentiate TCP clients
sharing an IP address. The proposed method uses a TCP option, which avoids
altering the application-level payload and works well with SSL-protected
connections.IPFIX
Information Elements for Logging NAT EventsNAT
devices are required to log events like creation and deletion of translations
and information about the resources it is managing. With the wide deployment of
Carrier Grade NAT (CGN) devices, the logging of events have become very
important for legal purposes. The logs are required in many cases to identify
an attacker or a host that was used to launch malicious attacks and/or for
various other purposes of accounting. Since there is no standard way of
logging this information, different NAT devices behave differently and hence it
is difficult to expect a consistent behavior. The lack of a consistent way
makes it difficult to write the collector applications that would receive this
data and process it to present useful information. This document describes the
information that is required to be logged by the NAT
devices.Syslog
Format for NAT LoggingWith
the wide deployment of Carrier Grade NAT (CGN) devices, the logging of
NAT-related events has become very important for legal purposes. The logs may
be required to identify a host that was used to launch malicious attacks or
engage in illegal behaviour, and/or may be required for accounting purposes.
This document identifies the events that need to be logged and the parameters
that are required in the logs depending on the context in which the NAT is
being used. It goes on to standardize formats for reporting these events and
parameters using SYSLOG (RFC 5424). A companion document specifies formats for
reporting the same events and parameters using IPFIX (RFC 5101). Applicability
statements are provided in this document and its companion to guide operators
and implementors in their choice of which technology to use for
logging.Forwarded
HTTP ExtensionThis
document defines an HTTP extension header field that allows proxy components to
disclose information lost in the proxying process, for example, the originating
IP address of a request or IP address of the proxy on the user-agent-facing
interface. In a path of proxying components, this makes it possible to arrange
it so that each subsequent component will have access to, for example, all IP
addresses used in the chain of proxied HTTP requests. This document also
specifies guidelines for a proxy administrator to anonymize the origin of a
request.Privacy
Considerations for Internet ProtocolsThis
document offers guidance for developing privacy considerations for inclusion in
IETF documents and aims to make protocol designers aware of privacy-related
design choices. Discussion of this document is taking place on the IETF
Privacy Discussion mailing list (see
https://www.ietf.org/mailman/listinfo/ietf-privacy).Revealing
Hosts Sharing An IP Address Using ICMP Echo RequestWhen
an IP address is shared among several subscribers -- with a NAT or with an
application-level proxy -- it is impossible for the server to differentiate
between different clients. Such differentiation is valuable in several
scenarios. This memo describes a technique to differentiate TCP and UDP
clients sharing an IP address. The proposed method uses an ICMP Echo Request
packet, which allows for more information about the user mapping to be
transmitted than in the case of using the TCP option - and allows the use with
UDP and other protocols.IPv4
Header Option For User Identification In CGN ScenarioIn
some application scenarios, it is necessary to be able to identify an user when
CGN is deployed. This document defines a new IPv4 header option for host
identification, which contains NAT mapping information, e.g. the internal
source IP address before translation. Each time a NAT device performs
translation on an IP packet, NAT mapping information will be added in the IP
header. With the NAT mapping information, it will be easy to identify a
host.Using PCP to
Reveal a Host behind NATThis
document describes how to use PCP to retrieve the identify of a host behind a
NAT. Two use cases are discussed and the PCP applicability is analyzed. This
document extends PCP with a new OpCode: QUERY. The proposed mechanism is valid
for all NAT flavors including NAT44, NAT64 or
NPTv6.TCP Fast
OpenTCP
Fast Open (TFO) allows data to be carried in the SYN and SYN-ACK packets and
consumed by the receiving end during the initial connection handshake, thus
saving up to one full round trip time (RTT) compared to standard TCP which
requires a three-way handshake (3WHS) to complete before data can be
exchanged. TerminologyDeterministic
Address Mapping to Reduce Logging in Carrier Grade NAT
DeploymentsIn
some instances, Service Providers have a legal logging requirement to be able
to map a subscriber's inside address with the address used on the public
Internet (e.g. for abuse response). Unfortunately, many Carrier Grade NAT
logging solutions require active logging of dynamic translations. Carrier
Grade NAT port assignments are often per-connection, but could optionally use
port ranges. Research indicates that per-connection logging is not scalable in
many residential broadband services. This document suggests a way to manage
Carrier Grade NAT translations in such a way as to significantly reduce the
amount of logging required while providing traceability for abuse response.
While the authors acknowledge that IPv6 is a preferred solution, Carrier Grade
NAT is a reality in many networks, and is needed in situations where either
customer equipment or Internet content only supports IPv4; this approach should
in no way slow the deployment of IPv6.HOST_ID
TCP Options: Implementation & Preliminary Test ResultsThis
memo documents the implementation of the HOST_ID TCP Options. It also discusses
the preliminary results of the tests that have been conducted to assess the
technical feasibility of the approach as well as its scalability. Several
HOST_ID TCP options have been implemented and
tested.IP Options Are Not An OptionR. Fonseca, G. Porter, R. Katz, S. Shenker, and I.
Stoica,Measuring Interactions Between Transport Protocols and
MiddleboxesMedina, A, Allman, M. and S. FloydIs It Still Possible to Extend TCP?Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
Handley, M. and H. Tokuda,Trusted XFF ListA Technique for Counting NATted HostsThe PROXY protocol