Chatrooms within a Centralized
Conferencing (XCON) SystemPolycommary.ietf.barnes@gmail.comNS-Technologieschris@ns-technologies.comEricsson
Hirsalantie 11
Jorvas 02420Finlandsalvatore.loreto@ericsson.comDISPATCH Working GroupThe document "A Framework for Centralized Conferencing" defines a
centralized conference as both signaling and protocol agnostic. The
primary examples within this framework focus on audio and video as the
media types for the session. This document provides an overview of
the mechanisms defined in the centralized conferencing framework that can
be used to support multi-user chat. In addition,
the document describes additional functionality and requirements necessary to
provide feature rich functionality.
A Centralized Conference as defined by the "A Framework for
Centralized Conferencing" (XCON Framework)
is both signaling and protocol agnostic. The primary examples within the
framework focus on audio and video as the media types for the session.
This document provides an overview of the mechanisms and associated framework elements
involved when text is the media for the conference. This
functionality is often referred to as a "multi-user chat" as
it enables a participant to join a chatroom (e.g. hosted by the conference server)
for the exchange of messages between multiple participants. The message can be plain text
or can contain different format for more advanced functionality.
Several existing protocols support this multi-user chat functionality, such as
Extensible Messaging and Presence Protocol (XMPP) ,
and Internet Relay Chat (IRC) defined in and its successors:
,,,
.
In addition, provides
multi-user chat functionality for a purely SIP signaling based solution option
using Message Session Relay Protocol (MSRP) .
The focus of this document is to describe the interface and provide guidelines for the
the support of
existing multi-user chat functionality on a conferencing system based on the XCON framework
using the Conference Control Manipulation Protocol (CCMP)
independent of the specific media type used by the chat client.
The functionality described in this document is not intended to
replace any of the existing chat protocols, nor is it specifying a new chat protocol.
The motivation for this document is to allow clients that use the conferencing framework model
for other media types (e.g. voice/video) to utilize the same conference control mechanisms and
conferencing system to establish, update and delete a chatroom associated with a
conference instance,
independent of the chat protocol.
This approach also allows the conferencing system to provide a natural
interworking point for various
chat protocols - the details of the interworking are outside the scope of this document.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in .
This document reuses the terminology defined in the Centralized Conferencing Framework
and related protocol document .
Additional terminology used in this document:
a Conferencing Client as defined in
that participates in a "chatroom".A virtual space that users figuratively enter in order to participate in real-time,
text-based conferencing with other users.The functionality that allows multiple users to exchange messages in the context of a room or channel,
similar to Internet Relay Chat (IRC).A message sent from one participant directly to
another participant - i.e., not to the chatroom itself
to all participants. provides a general
illustration of chat clients having a direct, 1:1 connection to the
conferencing system. Participants can use the chat clients to join
a room associated with a conference instance and send messages.
The conferencing system receives the messages sent from a
client participating in a conference instance and then distributes them
to the other clients associated with the conference instance, that are also
present in the chatroom.
The approach in this document is to have no impact on the existing chat
protocols, while taking full advantage of the functionality provided by
the centralized conferencing framework.
A basic solution for MSRP based IM chat sessions is documented in
. It uses the concept of an
"MSRP switch" as the centralized component, whose role is very similar
to the Conferencing Server in this document. The solution in
doesn't explicitly take advantage
of the centralized conferencing framework model, as it primarily intends
to make use of the basic SIP conferencing framework to provide the basic
chat functionality. The MSRP based IM chat solution is compatible with
the solution components described in this document, with no impact on
that basic solution proposal. One of the advantages of applying the two
solutions in concert would be to take advantage of the centralized conferencing
framework model for advanced features, such as sidebars and private
conferences, and manipulation of the conference data.
XMPP assumes a decentralized client-server architecture similar to the one
shown in , wherein a client utilizing
XMPP accesses a server and servers can also communicate with each other over
TCP connections, similar to the email network. The XMPP server can provide
as additional functionality the multi-user conferencing services
.
The XMPP multi-user conferencing service is also compatible with the solution
components described in this document
with no impact on the basic solution proposal.
Indeed, the centralized conferencing framework model
is perfectly able to manage the XMPP strong room control model, including the
ability to kick and ban users,
to name room moderators and administrators, to require membership or passwords in order to join the room.
However it is worth noting that the centralized conferencing framework does not encompass
the communication
between servers, as XMPP does. Thus, for the solution proposal in this document,
the XMPP client SHOULD only have a direct connection with the server hosting the chatroom instance, and
federations between
servers SHOULD NOT be allowed.
The multi-user chat protocol operations, such as create, join and delete
can be performed using both
non-signaling specific mechanisms or protocol specific mechanisms, if defined.
Non-signaling specific
mechanisms are defined in the Centralized Conferencing Framework
and related Conference Control Manipulation Protocol (CCMP) document
.
This document provides the details for the non-signaling specific mechanisms using CCMP
with detailed examples provided
in .
Protocol specific mechanisms are defined in other documents such as for SIP in the
SIPPING Conference Framework
and for XMPP in Multi-User Chat .
The privilege to create a chatroom associated with a conference instance can be restricted to
certain users or
can be reserved to an administrator of the conference. The room creation
can be performed using non-signaling mechanism
or protocol specific mechanism if defined. In the case of CCMP, a confRequest message with a
"create" operation is sent by the chat client. A participant can query the conferencing system to discover the list of the chat rooms
associated with a hosted conference instance. In the case of CCMP, a blueprintsRequest message
for the chatrooms supported by a conferencing system or a confsRequest message for the active
chatrooms can be
sent by the chat client.In order to participate in the discussions held in a multi-user chat room,
a participant MUST first enter the room.
A chat client wishing to enter a chatroom associated with a conference instance
MAY use a non-signaling
or protocol specific mechanism if defined. In the case of CCMP, a participant MUST
join a conference instance using the mechanisms which are described in
- e.g., userRequest message with a "create" operation
to be added to a conference instance.
The request to send a message is specific to the chat protocol (e.g., MSRP SEND).
Upon receipt of a request to send a message,
the conferencing system replicates and forwards the message to all other
chat clients that are participants
of the chat room. Depending upon policy, a conferencing system MAY ignore or reject messages,
in which case they are
not distributed to the other
chat room participants.
A participant MAY send a "private message" to a selected participant or a group of participant.
This privilege SHOULD be
allowed for all participants unless local policy dictates otherwise.
A chat client wishing to exit a chat room uses a non-signaling mechanism or protocol
specific mechanism, if defined.
If the chat client is the last to exit, the conferencing system
can be responsible for deleting the room or the originator/owner/moderator
The privilege to delete a chatroom associated with a conference instance
can be restricted to certain users or
can be reserved to an administrator of the conference.
The room deletion can be performed using non-signaling mechanism
or protocol specific mechanism if defined. In the case of CCMP, the client
MUST send a CCMP confRequest message with an operation of "delete".
As highlighted in the overview section, a chat client connecting to
a conferencing system has a 1:1 relationship with the chat signaling
entity, each having a unique protocol specific chat session identifier (chat-ID). When
referring to chat-IDs the document is making reference to the
locally (at conferencing system) generated chat-ID used for
session signaling identification. An important concept in
this proposal is the creation and management of Group Chat sessions. It is
important that each chat session created, as identified by a unique chat-ID,
is explicitly tied to an associated conference.
The Centralized Conferencing Framework
introduces the concept of a
conference user identifier which is defined in
. When a user joins a
conference instance through the signaling protocol, the user is allocated an
appropriate conference user identifier either through authentication or
system allocation. The conference user identifier is
represented by the 'entity' attribute of a <user> element in the <users> element
in the conference information.
The association of the chat-IDs is
accomplished by including each of the chat-IDs
in the conference information in the 'entity' attribute of an <endpoint> element
in the <user> element.
The conference information as a whole is uniquely
identified within the conferencing system by an XCON-URI, thus providing the relevant
association between a chat session and a centralized conference.
shows the logical repesentation of
the chat-IDs with the
conf-userIDs, with each row in the table
representing a single entry.
A more complex session association is necessary due to potential
for a user to have multiple group chats in a single conference
instance, such as multi-lingual conference support. In an example with
SIP and MSRP, the conference representation in allows for such functionality
when separate SIP dialogs represent MSRP sessions. This process
becomes complex in the case that multiple SDP MSRP media sessions (m=) are defined
in a single payload. This internal representation needs expanding
to enable a conferencing system to explicitly associate a media
session (m=). This involves including the media label, as defined in
, to maintain the internal conference
association. An example is illustrated in .
In , conference user
identifiers '0283hHu' and 'pakdj7H' appear twice. The combination of
multiple conference user identifiers and a unique chat-ID
enables the conference system to clearly identify a specific Group Chat
instance. Even in the simplest conferencing system, where
users are allowed to enter anonymously, the internal representation
described in this section should be observed. In this case, the
conferencing system would still internally create a conference user
identifier for participant reference purposes.
Advanced chat
features, such as sidebars and private messages can also be supported
within the context of
the centralized conferencing framework using CCMP.
Additional protocol details for
these advanced features are provided in .
This section discusses additional operations or features required to
provide chat room functionality. Most of the operations are not
explicitly defined in the centralized conferencing framework. While
most of the features and operations are achievable using the XCON data model
and data maintained
by a conferencing system per the XCON framework, some advanced features
require extensions to the XCON data model
and may be optimized with more discrete CCMP messages.
Nicknames allow a user to define a text string that uniquely
identifies the user within a particular chatroom without necessarily
reflecting any protocol specific identity (e.g., SIP URI, Conference
User Identifier, etc.). It is also important to note that the functionality
to provide nicknames is not limited to users involved in chatrooms,
thus it should be a general feature of the conferencing system.
Within a conferencing system, all nicknames MUST map to a
conference user identifier. The nicknames are unique only to
the specific conferencing system.
To ensure uniqueness of nicknames, any new
'nickname' created MUST be compared with nicknames
already in use or reserved following the rules defined in Preparation
and Comparison of Nicknames . There may be multiple nicknames associated
with a single conference user identifier (e.g., a user that has
different nicknames for different chat rooms and/or voice/video
conferences). In order to support nicknames, a 'nickname' attribute is defined in the
XCON data model within the <user> element. A 'nickname'
can be assigned to the conference
user when an XCON-USERID is assigned to the user.
The conferencing client MAY include a preferred nickname in the CCMP userRequest with a
"create" operation.
The conferencing system allocates a conference user identifier
and a nickname using system specific mechanisms, which can also include
authentication. The conferencing system MUST associate the assigned
nickname with the specific
conference user identifier that has been allocated by updating the conference information.
Another option would be to define a new CCMP message to just manipulate the
'nickname' element, but that is not necessary.
As described , the XCON-userid identifier is used in
conjunction with a chat-ID to internally represent a
participant in a conference instance. This association is created when a
conferencing client requests to create or join a specific chatroom. The nickname
allocated for the specific conferencing user identifier MUST also be associated with
the chat session ID.
provides an example of the association between the
chat session identifier, the conference user identifier and conference nickname
for a specific Group Chat represented by the conference identifier.
Depending
upon the conferencing system, the conference system either allocates the
preferred nickname for that user or
allocates a different nickname. The nickname MUST be included
in the CCMP userResponse message.
In the future, if a more generic nickname mechanism is available, rather than
provide nicknames that
are specific to the conferencing system, a conferencing system may
interface with a nickname registry, for
example, in order to allocate a new nickname for a specific conferencing client.
This change in how
a conferencing system allocates nicknames should not impact the
CCMP protocol interface to support nicknames.
A common chat feature involves logging the history of a chat room.
This provides a record of a chat room that can be used when a user
first joins a chat room as discussed in .
It can also be used to provide a complete capture of a specific chat room session.
When a participant enters a room in which the discussions are logged, the conferencing system
MUST warn the participant that the discussions are logged.
The centralized conferencing framework does not fully describe the role of recording
or logging of active conferences. However, this functionality can be
realized with the manipulation of the appropriate elements in
the data model using the general conference control protocol
operations. One approach for implementing this function would be to
have it be based on specific manipulation of the conference by a user
with the appropriate permissions
(i.e., confRequest messaage with an "update" operation to start and stop recording).
Another mechanism for implementing this function would be to have a specific
user as part of the conference to perform this function, and having the media proxied to a
logging device. In the case of systems that support the Media Control archictecture
and
SIP Control Framework
along with the specific Mixer control package
,
the addition of a user to proxy the media for recording is described in
section 6.2.3 in
A common chat feature allows users to view the past history of chat
rooms. This operation is common when a user first joins a chat room
that is underway. A user is often offered the option to review a
specific number of past messages.
Conferencing systems that maintain
the history associated with specific chat rooms through logging, as
described in , should provide a
mechanism, using the conference identifier, to access the specific
information requested by a user based on a specific timestamp.
The user request for the information
and the rendering of the information is specific to the user's session
based messaging protocol and may not be supported by all the messaging
protocols.
Another chat room feature provides the details of an alternate chat
room venue for previously active chat rooms that have been closed,
with a related topic. While not detailed in the centralized
conferencing framework, this functionality can be accomplished by
creating the new chat room as a child or sibling of the previous chat
room and providing the Active chat conference object identifier to any
valid users that attempt to join a previous chat room. The information
about the new chat room can also be provided at the end of a chat room
that is being de-activated at the end of the session.
The ability to send files to a selected participant or group of participants
is another common functionality, supported by messaging protocols.
This functionality also enables the exchange of information (e.g. name, size, and date)
about the file to be transferred and usually provides a mechanism to show an
image thumbnail for files such as photos. This capability could be reflected in the
conference data (in the conference instance) and requires at least an extension to the
"available-media" element. The thumbnail rendering of the image is outside the scope of
the data model and is specific to the client application. Additional functionality to support
this capability requires further study.
The messaging protocols can be used, and are being used,
in applications quite different from a simple exchange of
text messages between two participants in the context of a chatroom.
Real-time collaboration tools (e.g. Whiteboarding,
screen-share, co-browse or document-share) are some of these applications.
The Conferencing Systems are usually bound to Real-time collaboration tools
to increase the productivity of distributed teams. In terms of correlating this
functionality with CCMP, the mechanisms for
manipulating the conference are the same in terms of updating the data associated
with the conference
with the additional attributes to reflect the multiple sources of media for the chatroom.
This capability could be reflected in the
conference data (in the conference instance) with an extension to the
"available-media" element. Some current systems using SIP embed the attributes
in the media stream. Overall, supporting this functionality requires further study,
in particular with regards to the RTCWeb initiative as described in documents such
as
As discussed in the Centralized Conferencing Framework, there are a
wide variety of potential attacks related to conferencing, due to the
natural involvement of multiple endpoints and the many, often
user-invoked, capabilities provided by the conferencing system. Examples
of attacks associated with chatrooms includes the
following: an endpoint attempting to receive the messages for
conferences in which it is not authorized to participate, an endpoint
attempting to disconnect other users, and theft of service, by an
endpoint, in attempting to create conferences it is not allowed to
create.
Since this document describes the use of existing protocols (i.e.,
MSRP/SIP, CCMP, XMPP, etc.), it depends on the
security solutions for those protocols and the associated authorization
mechanisms. This solution is based on the Centralized
Conferencing framework and makes use of the policy associated with the
conference object to ensure that only authorized entities are able to
manipulate the data to access the capabilities. This solution also makes
use of the privacy and security of the identity of a user in the
conference, as discussed in the Centralized Conferencing Framework.
This document requires no IANA registrations.
The authors appreciate the input and comments from Miguel
Garcia-Martin and Dave Morgan.
Multi-User Chatstpeter@jabber.org