SIMPLE Made Simple: An Overview of the IETF Specifications
for Instant Messaging and Presence Using the Session
Initiation Protocol (SIP)
jdrosen.net
jdrosen@jdrosen.net
http://www.jdrosen.net
RAI
SIMPLE
SIP
SIMPLE
presence
IM
The IETF has produced many specifications related to
Presence and Instant Messaging with the Session Initiation
Protocol (SIP). Collectively, these specifications are
known as SIP for Instant Messaging and Presence
Leveraging Extensions (SIMPLE). This document serves as a guide to
the SIMPLE suite of specifications. It categorizes the
specifications, explains what each is for,
and describes how they relate to each other.
The IETF has produced many specifications related to Presence and
Instant Messaging with the Session Initiation Protocol
(SIP). Collectively, these specifications are
known as SIP for Instant Messaging and Presence Leveraging
Extensions (SIMPLE). These specifications cover topics ranging from protocols
for subscription and publication to presence document formats to
protocols for managing privacy preferences. The large number of
specifications can make it hard to figure out exactly what SIMPLE is, what specifications cover it, what functionality it
provides, and how these specifications relate to each other.
This document serves to address these problems. It provides an
enumeration of the protocols that make up the SIMPLE suite of
specifications from IETF. It categorizes them into related areas of
functionality, briefly explains the purpose of each, and describes how the
specifications relate to each other. Each
specification also includes a letter that designates its category. These values are:
S: Standards Track
E: Experimental
B: Best Current Practice
I: Informational
SIMPLE provides for both presence and instant messaging (IM) capabilities. Though both
of these fit underneath the broad SIMPLE umbrella, they are well
separated from each other and are supported by different sets of
specifications. That is a key part of the SIMPLE story; presence is
much broader than just IM, and it enables communications using voice
and video along with IM.
The SIMPLE presence specifications can be broken up into:
The core protocol machinery, which provides the actual SIP extensions for
subscriptions, notifications, and publications
Presence documents, which are XML documents that provide for rich
presence and are carried by the core protocol machinery
Privacy and policy, which are documents for expressing privacy
preferences about how those presence documents are to be shown (or not
shown) to other users
Provisioning, which describes how users manage their privacy policies,
buddy lists, and other pieces of information required for SIMPLE
presence to work
Optimizations, which are improvements in the core protocol machinery
that were defined to improve the performance of SIMPLE, particularly
on wireless links
RFC 6665 defines the SUBSCRIBE and NOTIFY
methods for SIP, forming the core of the SIP event notification
framework. To actually use the framework, extensions need to be
defined for specific event packages. Presence is defined as an event
package within this framework. Packages exist
for other, non-presence related functions, such as message waiting
indicators and dialog state changes.
RFC 3856 defines an event package for indicating user
presence through SIP. Through this package, a SIP user agent (UA) can ask
to be notified of the presence state of a presentity (presence
entity). The contents of the NOTIFY messages in this package are
the presence documents described later in this document.
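As a rough illustration of the subscription described above, the following Python sketch assembles a SUBSCRIBE request for the presence event package. The helper name build_subscribe, the example.com addresses, and the fixed branch, tag, and Call-ID values are all hypothetical placeholders; a real user agent would generate unique values for each request. Only the "Event: presence" and "Accept: application/pidf+xml" header values reflect the presence package itself.

```python
def build_subscribe(watcher: str, presentity: str, expires: int = 3600) -> str:
    """Return a minimal SIP SUBSCRIBE request for the presence event package.

    Illustrative only: the Via branch, From tag, Call-ID, and Contact are
    placeholder values, not something a deployed UA should reuse.
    """
    lines = [
        f"SUBSCRIBE {presentity} SIP/2.0",
        "Via: SIP/2.0/TCP watcher.example.com;branch=z9hG4bKexample1",
        "Max-Forwards: 70",
        f"To: <{presentity}>",
        f"From: <{watcher}>;tag=example-tag",
        "Call-ID: example-call-id@watcher.example.com",
        "CSeq: 1 SUBSCRIBE",
        "Event: presence",               # selects the presence event package
        "Accept: application/pidf+xml",  # presence documents wanted in NOTIFY
        f"Contact: <{watcher}>",         # placeholder; normally a device URI
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    # SIP header lines end with CRLF; an empty line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"
```

The server would answer such a request and then send NOTIFY requests carrying the presentity's presence document for the lifetime of the subscription.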
RFC 4662 defines an
extension to the original SIP event notification framework (which has
now been obsoleted by RFC 6665) that allows a client to subscribe to a
list of resources using a single subscription. The server, called a Resource
List Server (RLS), will "expand" the subscription and subscribe to each
individual member of the list. Its primary usage with presence is to
allow subscriptions to "buddy lists". Without RFC 4662, a UA would
need to subscribe to each presentity individually. With RFC 4662, the UA
can have a single subscription to all buddies. A user can manage the
entries in their buddy list using the provisioning mechanisms
described later in this document.
is very similar to
RFC 4662. It allows a client to subscribe to a list of
resources using a single subscription. However, with this
mechanism, the list is included within the body of the
SUBSCRIBE request. In RFC 4662, it is provisioned ahead
of time on the server.
defines the PUBLISH method. With
this method, a UA can publish its current state for any event
package, including the presence event package. Once an agent publishes
its presence state, the presence server would send notifications of
this state change using RFC 3856.
Once a user has generated a subscription to presence using the core
protocol machinery, they will receive notifications (SIP NOTIFY
requests) that contain presence information. That presence
information is in the form of an XML presence document. Several
specifications have been defined to describe this document format,
focusing on rich, multimedia presence.
RFC 3863 defines the baseline XML format for a
presence document. It defines the concept of a tuple as representing a
basic communication modality and defines a simple status for it (open
or closed).
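To make the tuple and basic status concrete, here is a minimal Python sketch that builds a one-tuple presence document and reads the basic status back out. The function names build_pidf and basic_statuses and the example.com identifiers are hypothetical; the namespace and element names are those of the baseline presence format.

```python
import xml.etree.ElementTree as ET

PIDF_NS = "urn:ietf:params:xml:ns:pidf"

def build_pidf(entity: str, tuple_id: str, basic: str, contact: str) -> str:
    """Build a presence document with one tuple whose status is open or closed."""
    if basic not in ("open", "closed"):
        raise ValueError("basic status must be 'open' or 'closed'")
    ET.register_namespace("", PIDF_NS)  # serialize with a default namespace
    presence = ET.Element(f"{{{PIDF_NS}}}presence", {"entity": entity})
    tup = ET.SubElement(presence, f"{{{PIDF_NS}}}tuple", {"id": tuple_id})
    status = ET.SubElement(tup, f"{{{PIDF_NS}}}status")
    ET.SubElement(status, f"{{{PIDF_NS}}}basic").text = basic
    ET.SubElement(tup, f"{{{PIDF_NS}}}contact").text = contact
    return ET.tostring(presence, encoding="unicode")

def basic_statuses(pidf_xml: str) -> dict:
    """Map each tuple id in a presence document to its basic status."""
    ns = {"p": PIDF_NS}
    root = ET.fromstring(pidf_xml)
    return {
        t.get("id"): t.findtext("p:status/p:basic", namespaces=ns)
        for t in root.findall("p:tuple", ns)
    }
```

A watcher receiving such a document in a NOTIFY could call basic_statuses to decide which communication modalities are currently open.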
RFC 4479 extends the basic model in RFC
3863. It introduces the concepts of device and person status and
explains how these relate to each other. It describes how presence
documents are used to represent communications systems states in a
consistent fashion. More than RFC 3863, it defines what a presence
document is and what it means.
adds many more attributes to
the presence document schema, building upon the model in RFC 4479. It
allows for indications of activities, moods, places and place types,
icons, and indications of whether or not a user is idle.
adds attributes to the presence document
schema, again building upon the model in RFC 4479. It allows documents
to indicate status for the future or the past. For example, a user
can indicate that they will be unavailable for voice communications
from 2 p.m. to 3 p.m. due to a meeting.
adds attributes to the presence document schema for contact
information, such as a vCard, display name, homepage, icon, or sound
(such as the pronunciation of their name).
adds even
more attributes to the presence document schema, this time to allow
indication of capabilities for the user agent. For example, the
extensions can indicate whether a UA supports audio and video, what
SIP methods it supports, and so on.
The rich presence capabilities defined by the presence document
specifications introduce a strong need for privacy
preferences. Users must be able to approve or deny subscriptions to
their presence and indicate what information such watchers can
see. In SIMPLE, this is accomplished through policy documents
uploaded to the presence server using the provisioning mechanisms
described later in this document.
RFC 4745 defines a general XML framework for expressing privacy preferences for
both geolocation information and presence information. It introduces
the concepts of conditions, actions, and transformations that are
applied to privacy-sensitive data. The common policy framework
provides privacy safety, a property by which network errors or version
incompatibilities can never cause more information to be revealed to a
watcher than the user would otherwise desire.
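One way to see why this property holds is to look at how permissions from several matching rules are combined. The Python sketch below assumes the additive combining rules of RFC 4745 (logical OR for Boolean permissions, the maximum for integer permissions, and the union for set permissions). The dictionary layout, the use of a single identity condition, and the permission names are hypothetical simplifications, not the XML format itself.

```python
def combine_permissions(rules, watcher):
    """Combine the permissions of every rule whose condition matches the watcher.

    Each rule is a dict with an 'identities' set (the only condition modeled
    here) and a 'permissions' dict. Permissions are only ever added, so a
    missing or unreadable rule can reduce what is revealed but never extend it.
    """
    granted = {}
    for rule in rules:
        if watcher not in rule["identities"]:
            continue  # condition not satisfied; the rule contributes nothing
        for name, value in rule["permissions"].items():
            if name not in granted:
                granted[name] = value
            elif isinstance(value, bool):      # check bool before int
                granted[name] = granted[name] or value
            elif isinstance(value, int):
                granted[name] = max(granted[name], value)
            else:                              # assumed to be a set
                granted[name] = granted[name] | value
    return granted
```

Because no rule can revoke what another rule grants, dropping a rule (for example, because a server does not understand it) leaves a watcher with the same or fewer permissions.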
uses the
framework of RFC 4745 to define a policy document format for
describing presence-privacy policies. Besides basic yes/no approvals,
this format allows a user to control what kind of information a
watcher is allowed to see.
RFC 3857, also known as watcherinfo,
provides a mechanism for a user agent to find out what subscriptions
are in place for a particular event package. Though it was defined to
be used for any event package, it has particular applicability for
presence. It is used to provide reactive authorization. With reactive
authorization, a user gets alerted if someone tries to subscribe to
their presence, so that they may provide an authorization
decision. Watcherinfo is used to provide the alert that someone has
subscribed to a user's presence.
is the
companion to RFC 3857. It specifies the XML format
of watcherinfo that is carried in notifications for the event template
package in RFC 3857.
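Reactive authorization can be sketched as follows: a client receives a watcherinfo document and extracts the watchers whose subscriptions are still pending, so that the user can be prompted for a decision. The sample document, the example.com addresses, and the helper name pending_watchers are hypothetical; the namespace, element names, and status values follow the watcherinfo format as commonly described.

```python
import xml.etree.ElementTree as ET

WINFO_NS = "urn:ietf:params:xml:ns:watcherinfo"

# Hypothetical notification body: one pending and one active watcher.
SAMPLE = """<?xml version="1.0"?>
<watcherinfo xmlns="urn:ietf:params:xml:ns:watcherinfo" version="0" state="full">
  <watcher-list resource="sip:user@example.com" package="presence">
    <watcher id="w1" status="pending" event="subscribe">sip:alice@example.com</watcher>
    <watcher id="w2" status="active" event="approved">sip:bob@example.com</watcher>
  </watcher-list>
</watcherinfo>"""

def pending_watchers(winfo_xml: str) -> list:
    """Return the URIs of watchers still awaiting an authorization decision."""
    ns = {"w": WINFO_NS}
    root = ET.fromstring(winfo_xml)
    return [
        w.text.strip()
        for w in root.findall("w:watcher-list/w:watcher", ns)
        if w.get("status") == "pending"
    ]
```

Applied to SAMPLE, the function reports only the watcher that has subscribed but has not yet been approved or rejected.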
Proper operation of a SIMPLE presence system requires that several
pieces of data be correctly managed by the users and provisioned into
the system. These include buddy lists (used by the resource list
subscription mechanism in RFC 4662) and privacy policies (such as
those expressed in the presence-privacy policy document format
described above).
In SIMPLE, management of this data is handled by the Extensible Markup Language (XML) Configuration Access Protocol (XCAP). XCAP is used by the
user agent to manipulate buddy lists, privacy policy, and other data
that is represented by XML documents stored on a server.
specifies XCAP, a usage of
HTTP that allows a user agent to manipulate the contents of XML
documents stored on a server. It can be used to manipulate any kind of
XML, and the protocol itself is independent of the particular schema
of the data it is modifying. XML schemas have been defined for buddy
lists, privacy policies, and offline presence status, allowing all of
those to be managed by a user with XCAP.
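Because XCAP maps parts of an XML document onto HTTP URIs, much of a client's work is constructing the right URI. The Python sketch below assumes the usual layout of an XCAP URI (XCAP root, application usage identifier, the "users" tree, the user identifier, the document name, then an optional node selector after the "~~" separator). The host name, document name, and function name xcap_url are hypothetical.

```python
from urllib.parse import quote

def xcap_url(root, auid, xui, document, node_selector=None):
    """Build an XCAP URI for a user's document, optionally selecting a node.

    Characters such as '[', ']' and '"' in the node selector are
    percent-encoded so that the result is a valid HTTP URI.
    """
    url = "/".join(
        [root.rstrip("/"), auid, "users", quote(xui, safe="@:"), document]
    )
    if node_selector:
        url += "/~~/" + quote(node_selector, safe="/@=")
    return url
```

A client could then issue an HTTP GET, PUT, or DELETE on the resulting URI to read, create or replace, or remove the selected element.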
defines an extension to the
SIP user agent configuration profile, allowing a user agent to learn
about changes in its documents on an XCAP server. With this mechanism,
there can be a change made by someone else to a buddy list or privacy
policy document, and a UA will find out that a new version is
available.
defines an XML format for indicating changes in XCAP
documents. It makes use of a separately defined XML diff format. It is
used in conjunction with the change notification mechanism described
above to alert a user agent
of changes made by someone else to their provisioned data.
defines two XML document
formats used to represent buddy lists. One is simply a list of users
(or more generally, resources), and the other defines a buddy list
whose membership is composed of a list of users or resources. These
lists can be manipulated by XCAP, allowing a user to add or remove
members from their buddy lists. The buddy list is also accessed by the
resource list server specified in RFC 4662 for processing resource
list subscriptions.
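The following Python sketch shows what a simple buddy list looks like and how adding or removing a member changes it. It edits the document locally; in a deployment the same changes would be made on the server through XCAP. The list name, the example.com URIs, and the function names are hypothetical, while the namespace and the list and entry elements are those of the resource list format.

```python
import xml.etree.ElementTree as ET

RL_NS = "urn:ietf:params:xml:ns:resource-lists"

def make_list(name, uris):
    """Create a resource-lists document holding one named list of entries."""
    ET.register_namespace("", RL_NS)
    root = ET.Element(f"{{{RL_NS}}}resource-lists")
    lst = ET.SubElement(root, f"{{{RL_NS}}}list", {"name": name})
    for uri in uris:
        ET.SubElement(lst, f"{{{RL_NS}}}entry", {"uri": uri})
    return root

def entry_uris(root, list_name):
    """Return the URIs of all entries in the named list, in document order."""
    ns = {"rl": RL_NS}
    return [
        e.get("uri")
        for lst in root.findall("rl:list", ns)
        if lst.get("name") == list_name
        for e in lst.findall("rl:entry", ns)
    ]

def remove_entry(root, list_name, uri):
    """Remove one entry from the named list; return True if it was found."""
    ns = {"rl": RL_NS}
    for lst in root.findall("rl:list", ns):
        if lst.get("name") != list_name:
            continue
        for entry in lst.findall("rl:entry", ns):
            if entry.get("uri") == uri:
                lst.remove(entry)
                return True
    return False
```

A resource list server reading the same document would subscribe to each remaining entry on the user's behalf.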
defines an XCAP usage that allows a user to store an "offline"
presence document. This is a presence status that is used by
a presence server when there are no presence documents published for
that user by any user agents currently running.
Federation refers to the interconnection of different presence and instant
messaging systems for the purposes of communications. Federation can
be between domains or within a domain. A document has been developed
that describes how presence and IM federation works.
describes a basic set of presence and instant messaging use cases
for federating between providers.
When running over wireless links, presence can be a very expensive
service. Notifications often get sent when the change is not really
relevant to the watcher. Furthermore, when a notification is sent, it
contains the full presence state of the presentity, rather than just an
indication of what changed. Optimizations have been defined to address
both of these cases.
RFC 4660 defines a
mechanism that allows a watcher to include filters in its
subscription. These filters limit the cases in which notifications are
sent. It is used in conjunction with RFC 4661,
which specifies the XML format of the filters themselves. The
mechanism, though targeted for presence, can be applied to any SIP
event package.
RFC 4661 defines an
XML format used with the event notification
filtering mechanism defined in RFC 4660.
defines a new XML
format for representing changes in presence documents, called a
partial PIDF document. This format
contains an XML patch operation that, when applied to the
previous presence document, yields the new presence document. The
partial PIDF document is included in presence notifications when a
watcher indicates that they support the format.
defines a mechanism for receiving notifications that
contain partial presence documents.
defines a mechanism for publishing presence status using a partial
PIDF document.
defines an XML structure for representing changes in XML documents. It
is a form of "diff" but specifically for XML documents. It is used by
several of the optimization mechanisms defined for SIMPLE.
defines a
dictionary for usage with Signaling Compression (SigComp)
to improve the compressibility of presence
documents.
specifies mechanisms for
adjusting the rate of SIP event notifications. These mechanisms can
be applied in subscriptions to all SIP event packages.
SIMPLE defines two modes of instant messaging. These are page mode and
session mode. In page mode, instant messages are sent by sending a SIP
request that contains the contents of the instant message. In session
mode, IM is viewed as another media type, along with audio and video,
and an INVITE request is used to set up a session that includes IM
as a media type. While page mode is more efficient for conversations
of one or two messages, session mode is more efficient for longer
conversations since the messages are not sent through the SIP
servers. Furthermore, because IM is treated as a media type, all of the
features available in SIP signaling, such as third-party call control
and forking, are available for IM.
introduces the MESSAGE method, which can be used to send an instant
message through SIP signaling.
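In page mode, the text of the instant message is simply the body of the SIP request. The Python sketch below assembles such a MESSAGE request with a plain-text body. The function name build_message, the example.com addresses, and the fixed branch, tag, and Call-ID values are hypothetical placeholders.

```python
def build_message(sender: str, recipient: str, text: str) -> bytes:
    """Return a minimal page-mode SIP MESSAGE request carrying plain text.

    Illustrative only: the Via branch, From tag, and Call-ID are placeholders.
    """
    body = text.encode("utf-8")
    headers = [
        f"MESSAGE {recipient} SIP/2.0",
        "Via: SIP/2.0/TCP sender.example.com;branch=z9hG4bKexample2",
        "Max-Forwards: 70",
        f"From: <{sender}>;tag=example-tag",
        f"To: <{recipient}>",
        "Call-ID: example-call-id@sender.example.com",
        "CSeq: 1 MESSAGE",
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",  # length in bytes, not characters
    ]
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + body
```

Each message is an independent request routed through the SIP servers, which is why this mode suits short exchanges better than long conversations.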
defines a mechanism
whereby a client can send a single SIP MESSAGE to multiple
recipients. This is accomplished by including the list of recipients
as an object in the body and having a network server send a copy to
each recipient.
The Message Session Relay Protocol (MSRP) specification defines a small
text-based protocol for exchanging arbitrarily sized
content of any kind between users. An MSRP session is set up by
exchanging certain information, such as an MSRP URI, within SIP and
Session Description Protocol (SDP) signaling.
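To give a feel for the text-based framing, the Python sketch below builds a single, complete MSRP SEND request for a short text message. It assumes the message fits in one chunk and is not empty; the transaction identifier, paths, and message identifier passed in are hypothetical values that a real endpoint would take from the SDP exchange or generate itself.

```python
def msrp_send(transaction_id, to_path, from_path, message_id, text):
    """Frame a non-empty text message as one complete MSRP SEND request."""
    body = text.encode("utf-8")
    head = (
        f"MSRP {transaction_id} SEND\r\n"
        f"To-Path: {to_path}\r\n"
        f"From-Path: {from_path}\r\n"
        f"Message-ID: {message_id}\r\n"
        # Whole message in one chunk: bytes 1..N of N.
        f"Byte-Range: 1-{len(body)}/{len(body)}\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n"
    ).encode("ascii")
    # End-line: seven hyphens, the transaction id, and '$' meaning that
    # this chunk completes the message.
    end_line = f"\r\n-------{transaction_id}$\r\n".encode("ascii")
    return head + body + end_line
```

Larger content would be split across several SEND requests with different byte ranges, which this sketch does not attempt.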
defines a
wrapper around instant message content providing metadata, such as
the sender and recipient identity. The CPIM format is carried in
MSRP.
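The wrapper can be pictured as a set of message metadata headers, a blank line, and then the encapsulated content with its own content headers. The Python sketch below produces such a wrapped message for plain text. The function name wrap_cpim and the example identities are hypothetical, and only a few of the possible metadata headers are shown.

```python
def wrap_cpim(sender: str, recipient: str, date_time: str, text: str) -> str:
    """Wrap plain text in a CPIM-style envelope with sender and recipient."""
    return (
        # Message metadata headers
        f"From: <{sender}>\r\n"
        f"To: <{recipient}>\r\n"
        f"DateTime: {date_time}\r\n"
        "\r\n"
        # Headers describing the encapsulated content
        "Content-Type: text/plain; charset=utf-8\r\n"
        "\r\n"
        # The instant message content itself
        f"{text}"
    )
```

The wrapped result would travel as the body of an MSRP SEND request, so the sender and recipient identities stay attached to the content end to end.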
adds support for relays to MSRP. These relay servers
receive MSRP messages and send them towards the destination. They
provide support for firewall and NAT traversal and allow for features
such as recording and inspection to be implemented.
allows clients to negotiate which
endpoint in a session will establish the MSRP connection. Without this
specification, the client generating the SDP offer would initiate the
connection.
allows
middleboxes to anchor the MSRP connection, without the need for
middleboxes to modify the MSRP messages; thus, it also enables a
secure end-to-end MSRP communication in networks where such
middleboxes are deployed.
In SIMPLE, IM multi-user chat (also known as chat rooms) is provided
using regular SIP conferencing mechanisms. The frameworks for SIP
conferencing and conference control describe how all SIP-based
conferencing works, including joining and leaving, persistent and
temporary conferences, floor control and moderation, and learning of
conference membership, among other functions. All that is necessary
are extensions to provide features that are specific to IM.
"Multi-party Chat Using the Message Session Relay Protocol (MSRP)"
defines how MSRP is used
to provide support for nicknames and private chat within an IM
conference.
Several specifications have been written to provide IM-specific
features for SIMPLE. These include "is-typing" indications, which let a
user know when their messaging peer is composing a response, and
delivery notifications, which let a user know when their IM has been received.
defines an XML
format that can be sent in instant messages that indicates the status
of message composition. This provides the familiar "is-typing"
indication in IM systems, but also supports voice, video, and other
message types.
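A composition status document is small. The Python sketch below builds one, with the content type element indicating what kind of message is being composed (text by default, but it could equally name an audio or video type). The function name is_composing is hypothetical; the namespace and element names follow the message composition format as commonly described.

```python
import xml.etree.ElementTree as ET

ISC_NS = "urn:ietf:params:xml:ns:im-iscomposing"

def is_composing(state: str, content_type: str = "text/plain", refresh=None) -> str:
    """Build a composition status document with state 'active' or 'idle'."""
    if state not in ("active", "idle"):
        raise ValueError("state must be 'active' or 'idle'")
    ET.register_namespace("", ISC_NS)
    root = ET.Element(f"{{{ISC_NS}}}isComposing")
    ET.SubElement(root, f"{{{ISC_NS}}}state").text = state
    ET.SubElement(root, f"{{{ISC_NS}}}contenttype").text = content_type
    if refresh is not None:
        # Seconds after which the receiver may assume the status has lapsed
        ET.SubElement(root, f"{{{ISC_NS}}}refresh").text = str(refresh)
    return ET.tostring(root, encoding="unicode")
```

The resulting document would be sent to the peer as an instant message of its own, separate from the message being composed.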
provides delivery
notifications of IM receipt. This allows a user to know with certainty
that a message has been received.
This specification is an overview of existing specifications and does
not introduce any security considerations on its own.
Thanks to Vijay Gurbani, Barry Leiba, Stephen Hanna, and Salvatore
Loreto for their review and comments.
Multi-party Chat Using the Message Session Relay Protocol (MSRP)
The Message Session Relay Protocol (MSRP) defines a mechanism for sending instant messages within a peer-to-peer session, negotiated using the Session Initiation Protocol (SIP) and the Session Description Protocol (SDP). This document defines the necessary tools for establishing multi-party chat sessions, or chat rooms, using MSRP.