Presence Interdomain Scaling Analysis for SIP/SIMPLE

This document analyzes the SIP protocol for presence (also known as SIMPLE, although SIMPLE is not a different protocol from SIP but the name of the working group). It analyzes the traffic that is generated by presence subscriptions between domains and shows that the number of messages and the amount of data can be extremely large. In addition to the very large traffic, the document also analyzes the effects of a large presence system on the memory footprint and the CPU load. Optimizations to the SIP protocol that are approved or in progress are analyzed for their possible impact on the load. Another document provides requirements for optimizations while other documents contain suggestions for new optimizations: . This document is intended to drive work on possible solutions that will make the deployment of a SIP-based presence server a less challenging task.

Deployment of highly scalable presence systems is challenging by nature, and the developers of each protocol design their own techniques for optimizing it. Comparing protocols is beyond the scope of this document.

The document discusses the following areas. In each area we try to show the complexity and the load that the presence server has to handle in order to provide its service.

Message load - By computing the number of messages that are required for connecting presence systems, the document shows that the number of messages is very large and that some optimizations are clearly needed. We also show that the required bandwidth is very large.

State management - Due to the nature of the service that the presence server provides, the presence server has to manage a relatively large and complex state. Some computations are provided in the document.

Processing complexities - The presence server maintains many small objects and has to perform frequent operations on these objects. We show that these operations, and especially the optimizations that are intended to reduce the amount of data sent between watchers and presence servers, are not simple and may create a very heavy processing load on the presence server.

Groups - Resource List Servers reduce the number of sessions that are created between the watchers and the presence server. On the other hand, this optimization may create an exponential growth in subscription size because subscribing to large groups becomes so easy.

The term presence domain or presence system appears in the document several times. By this term we refer to a SIP-based presence server that provides presence subscription and notification services to its users. The system can be deployed in a small enterprise or in a very large consumer network.

Some optimizations are approved or are being defined for the SIP presence protocol, but even with these optimizations a very large number of messages and a large bandwidth are needed in order to establish federation between the presence systems of large communities. Further thinking is needed in order to make large deployments of presence systems less resource demanding. Note that even though this document talks about inter-domain traffic, the introduction of resource list servers (RLSs) creates a very similar traffic pattern in intra-domain and in inter-domain deployments. See detailed discussion on resource lists in .

The current optimizations that are approved, or that are accepted as working group items in the SIMPLE working group, can be divided into two categories:

Dialog saving optimizations - Here we refer to optimizations such as the resource list RFC or the URI list subscriptions draft . These documents define ways to reduce the number of dialogs that are required between the subscriber and the presence system. Note that dialog optimization or RLS usage, as the terms are used in this document, refers to the usage of a URI that represents a list, or of a URI list, between domains and not within the same domain. An example is a user Alice in domain example.org who subscribes to a URI such as external-reps-list at example.com, or who uses a URI list to subscribe to the users on her watch list in example.com. Note also that when calculating the traffic that is due to an RLS within a domain, the traffic between the RLS and the presence agents should also be taken into account. However, since this document mostly deals with inter-domain traffic, the traffic between the RLS and the presence agents was not taken into account.

Notification optimizations - Here we refer to the optimizations that are suggested in the subnot-etags draft . This draft suggests ways to suppress the sending of unnecessary NOTIFY messages, for example when a subscription is refreshed. There are other drafts that reduce the size of messages, such as partial notifications or filtering. In this document we mostly care about the number of messages and the bandwidth, so the partial optimizations can help a bit with the bandwidth but will not help with the number of messages.

In addition to the above optimizations, another optimization could have been considered but is not taken into account in the computations in this document. This optimization is the ability to receive some of the presence information not through the SIP protocol but by offline means, such as downloading some persistent presence information directly from a web site. The calculations here are based on the assumption that all data is carried in-band in the protocol, and no optimizations that enable getting the presence information by out-of-band means are taken into account. These optimizations may reduce the number of messages and the number of bytes significantly, but they are out of scope for this document.

Several assumptions are used in this document regarding the size of messages, the rate of presence change and more. It should be noted that these assumptions are not directly based on rigorous statistics gathered from actual SIP-based deployments of presence systems, but rather on experience with other types of presence-based systems. The following numbers are given as examples from real deployments and are not intended to be complete.

In a large consumer network we have seen the following patterns:
- Approximately 110 users in the watch list on average.
- Approximately 12 billion status changes a day (139k/second) across the network. When a proprietary binary protocol is used to convey the status changes, the average message size is about 188 bytes. When SIP NOTIFY is used, the average message size is about 1228 bytes.
- The average rate of logins/logouts in the system is about 2000 logins per second and about 4000 logouts per second.
- When something happens (a promotion, a contest, or a network hiccup) that causes many users to log in and log out simultaneously, there are about 20,000 logins per second.
- The peak rate of instant messages sent is about 50,000 messages per second.

In enterprise deployments we have seen the following patterns:
- The average watch list size was 200 users.
- About half of the registered users were online at peak time.
- The status change rate was 2 changes per hour.
- The average rate of logins/logouts in the system was about 5 logins per second, with an additional 15 logins/logouts during the start/end of day rush hours.

Even though the assumptions in this document are not based on rigorous statistical data, the target here is not to analyze a specific system but to show that even with VERY moderate assumptions (which are even lower than the observations mentioned above), the number of messages, the network bandwidth, the required state management and the load on the CPU are very high. Real-life systems should face much bigger scalability challenges. For example, the presence state change rate that we assumed (one presence state change per hour) is maybe one of the most moderate assumptions that we have taken. Experience from consumer networks shows that the frequency is much higher, especially with the younger generation that uses more presence attributes such as mood. In an environment where a user may have several devices and other sources of presence information, such as geographical location and calendar, the frequency of presence state changes will be much higher.

It is very hard to measure presence load since it depends heavily on the behavior of users, and the behavior of users differs a lot. Some users will have a very small number of presentities in their watch list while others may have hundreds if not thousands. Some users will change their state a lot and have many sources of presence information, while others may have a very small number of changes during the day. In addition, the "rush hour" calculations for the start and end of the day are not yet included in this document. Rush hour differs between enterprises and is different again in consumer presence systems. It is very hard, if not impossible, to capture all the possible combinations in a static document.

Throughout the calculations a certain number of users is assumed for the different models. This does not mean that in actual deployments all the users of the domain actually subscribe to presence documents and/or publish their presence document. Observing actual deployments shows that in the consumer market the number of users that use presence services may be 10 percent or less of the registered users. In the enterprise market the numbers tend to be around 50 percent of the registered enterprise users.

The same holds for the number of watched presentities per watcher: if only some percentage of the domain users are online at a given time, then this number should be scaled by that percentage. However, adding this assumption to the calculations would make them more complex than they already are, since the effect of the watched presentities that are not online would need to be taken into account. An empty NOTIFY would be sent for those presentities when the subscription is created, and there would be no updates for them. In order to make the computations less complex (they are complex enough as they are), the number of watched presentities used in the calculations is the number of federated presentities from the watch list that are online.

The basic SIP subscription dialog involves the following message transfer:
- SUBSCRIBE/200
- Initial NOTIFY/200
- (j) NOTIFY/200, where 'j' is the number of presence changes seen by the watcher
- (k) SUBSCRIBE/200, where 'k' is the number of subscription dialog refresh periods
- SUBSCRIBE/200 with Expires = 0 to terminate the dialog
- NOTIFY/200 ending the dialog

An individual watcher will generate X SIP subscription dialogs, corresponding to the number of presentities it chooses to watch.

The amount of traffic generated is significantly affected by several factors:
- Number of watchers connected to the system
- Number of presentities connected to the system
- Frequency of changes to presence information

This document contains several calculations that show the expected message rate and bandwidth between presence domains. The following sections explain the assumptions and methods behind the calculations.
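The message list above can be turned into a per-dialog count. The following is a minimal sketch that follows the list literally (each line counted as one request plus its 200 response); the function name is hypothetical, and it does not count the NOTIFY that a refresh may trigger, which the later steady-state formulas handle separately.

```python
def messages_per_dialog(state_changes: int, refreshes: int) -> int:
    """Count SIP messages (requests plus 200 responses) in one subscription dialog."""
    initial = 2 + 2              # SUBSCRIBE/200 and initial NOTIFY/200
    changes = 2 * state_changes  # j x NOTIFY/200
    refresh = 2 * refreshes      # k x SUBSCRIBE/200
    terminate = 2 + 2            # SUBSCRIBE (Expires = 0)/200 and final NOTIFY/200
    return initial + changes + refresh + terminate


# Example: j = 22 presence changes and k = 7 refreshes in one dialog.
print(messages_per_dialog(22, 7))
```

A watcher with X watched presentities and no dialog optimization would generate X such dialogs.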

The following are a number of "constants" that we use in the calculations. Some of the constants are used throughout the calculations while others change between use cases.

(C01) Subscription lifetime (hours) - The assumed lifetime of a subscription in hours. We assume 8 hours for all calculations.

(C02) Presence state changes / hour - The average number of times that a presentity changes his/her status in one hour. We assumed 3 times per hour for most calculations. Note that for some users in consumer messaging systems, the actual number of changes is likely to be much higher.

(C03) Subscription refresh interval / hour - The duration of the SUBSCRIBE session after which it needs to be refreshed. We assumed that the duration is one hour.

(C04) Total federated presentities per watcher - The number of presentities that the watcher is watching. The number changes in this document according to the type of the specific deployment.

(C05) Number of dialogs to maintain per watcher - The number of SUBSCRIBE dialogs that are maintained per watcher. If dialog optimization is not assumed this number is equal to C04, otherwise it is 1.

(C06) Total number of watchers in the federated presence domains. This is the number of all watchers in all the federated domains.

(C07) SUBSCRIBE message size in bytes. We assume 450 bytes in all calculations. The size is based on a typical SUBSCRIBE taken from RFCs.

(C08) 200 OK for SUBSCRIBE message size in bytes. We assume 370 bytes in all calculations. The size is based on a typical 200 OK taken from RFCs.

(C09) NOTIFY message size not including the presence document. The size of this message for a single presentity is assumed to be 500 bytes for the NOTIFY message itself (based on sizes from examples in RFCs).

(C10) 200 OK for NOTIFY message size in bytes. We assume 370 bytes in all calculations. The size is based on a typical 200 OK taken from RFCs.

(C11) Size of an average presence document. In the previous version of this document we used only the size of 3000 bytes for a presence document. This number was calculated based on examples of rich presence documents in RFCs. Following discussion on the SIMPLE list, where it was claimed that this may be too big, and because we are talking here about federation between communities where the rich presence document may be of less use, we have done all the calculations with two sizes of presence document. One size is the minimal size of the PIDF document, which was taken to be 350 bytes based on examples from RFCs, and the other size is 3000 bytes for a rich presence document . It should be noted that assuming 3000 bytes for a presence document is relatively modest if we take into account multiple devices and location information.

(C12) The size of NOTIFY when partial notification is used. We have taken this size to be 200 bytes. The size is much smaller than the example that is given in , but the example given there assumes multiple changes in the presence document and here we assume a single change.

When dialog optimization is used, an RLMI document is sent, and that document contains the presence documents for the users that are in the watch list. In the previous version of this document we omitted the overhead of the RLMI document. This "bug" was found by Victoria Beltran-Martinez and is fixed in this document by adding the constants C13, C14 and C15 to the calculations.

(C13) Item size per contact in the RLMI document, 160 bytes.

(C14) The size of the multipart boundary in RLMI notifications, 144 bytes.

(C15) The size of the XML root node in the RLMI document (once per notification), 144 bytes.
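For readers who want to reproduce the calculations, the constants can be collected in one place. The following is a minimal sketch; the class name, field names and the c05 helper are hypothetical, and the example values for C04 and C06 (4 and 40,000) are the ones used in the first use case of this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constants:
    """Constants C01..C15; c04 and c06 depend on the deployment being modeled."""
    c04: int          # federated presentities per watcher
    c06: int          # total watchers in the federated domains
    c01: int = 8      # subscription lifetime (hours)
    c02: int = 3      # presence state changes per hour
    c03: int = 1      # subscription refresh interval (hours)
    c07: int = 450    # SUBSCRIBE size (bytes)
    c08: int = 370    # 200 OK for SUBSCRIBE size (bytes)
    c09: int = 500    # NOTIFY size without the presence document (bytes)
    c10: int = 370    # 200 OK for NOTIFY size (bytes)
    c11: int = 350    # presence document size: 350 (PIDF) or 3000 (rich)
    c12: int = 200    # size used when partial notification is applied (bytes)
    c13: int = 160    # RLMI item size per contact (bytes)
    c14: int = 144    # multipart boundary in RLMI notifications (bytes)
    c15: int = 144    # RLMI XML root node, once per notification (bytes)

    def c05(self, dialog_optimization: bool = False) -> int:
        """Dialogs per watcher: C04 without dialog optimization, otherwise 1."""
        return 1 if dialog_optimization else self.c04


example = Constants(c04=4, c06=40_000)
print(example.c05(), example.c05(dialog_optimization=True))
```

Keeping C05 as a derived value rather than a stored field mirrors its definition in terms of C04 and the dialog optimization.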

The following are the calculations for the messages in the initial phase of the establishment of the subscriptions. The calculations contain both the number of messages and the number of bytes.

(I01) Number of initial SUBSCRIBE messages per watcher = C05.

(I02) Number of initial 200 OK messages for SUBSCRIBE messages per watcher = C05.

(I03) Number of initial NOTIFY messages per watcher = C05.

(I04) Number of initial 200 OK messages for NOTIFY messages per watcher = C05.

(I05) Total number and bytes of initial SUBSCRIBE messages for all watchers = Number - I01*C06, Bytes - I01*C06*C07.

(I06) Total number and bytes of initial 200 OK for SUBSCRIBE messages for all watchers = Number - I01*C06, Bytes - I01*C06*C08.

(I07) Total number and bytes of initial NOTIFY messages for all watchers = Number - I01*C06. The calculation for the number of bytes depends on whether dialog optimization is used. When dialog optimization is not applied the number of bytes is calculated by (I01*C06*C09)+(I01*C06*C11), and when dialog optimization is applied the number of bytes is calculated by (I01*C06*(C09+C14+C15))+(I01*C06*C04*(C11+C13+C14)).

(I08) Total number and bytes of initial 200 OK for NOTIFY messages for all watchers = Number - I04*C06, Bytes - I04*C06*C10.

(I09) Total number and bytes of initial messages per day = Number - numbers in I05+I06+I07+I08, Bytes - sizes in I05+I06+I07+I08.
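The initial-phase formulas I01 to I09 can be evaluated mechanically. The following is a minimal sketch under the stated constants; the function and parameter names are hypothetical, and the example inputs (C04 = 4, C06 = 40,000) are taken from the first use case in this document.

```python
# Message sizes in bytes, taken from constants C07..C15 in the text.
C07, C08, C09, C10 = 450, 370, 500, 370
C13, C14, C15 = 160, 144, 144


def initial_phase(c04, c06, c11=350, dialog_optimization=False):
    """Return (messages, bytes) for I09, the initial phase over all watchers."""
    c05 = 1 if dialog_optimization else c04
    i01 = c05                  # I01..I04 are all equal to C05
    per_type = i01 * c06       # message count in each of I05, I06, I07, I08
    subscribe_bytes = per_type * C07                       # I05
    subscribe_ok_bytes = per_type * C08                    # I06
    if dialog_optimization:                                # I07 with RLMI overhead
        notify_bytes = (per_type * (C09 + C14 + C15)
                        + per_type * c04 * (c11 + C13 + C14))
    else:                                                  # I07 without it
        notify_bytes = per_type * C09 + per_type * c11
    notify_ok_bytes = per_type * C10                       # I08
    total_bytes = (subscribe_bytes + subscribe_ok_bytes
                   + notify_bytes + notify_ok_bytes)
    return 4 * per_type, total_bytes


print(initial_phase(4, 40_000))
print(initial_phase(4, 40_000, dialog_optimization=True))
```

Passing c11=3000 gives the rich presence document variant of the same calculation.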

Here we describe the calculations for the steady state messages. Steady state is the time between the initial subscription and the teardown of the subscription. It contains the NOTIFY messages due to state changes and the subscription refreshes.

(S01) NOTIFY messages due to state change per watched presentity per day (less 2 since the NOTIFY for initial and terminating state is calculated in the initial and terminating calculations) = (C02*C01-2).

(S02) 200 (for NOTIFY due to state change) messages per watched presentity per day (less 2 since the NOTIFY for initial and terminating state is calculated in the initial and terminating calculations) = (C02*C01-2).

(S03) Total number and size of messages due to state change per day = Number - (S01+S02)*C06*C04. The calculation for the number of bytes depends on whether dialog optimization is used. When dialog optimization is not applied the number of bytes is calculated by (C06*C04)*((S01*(C09+C11))+(S02*C10)), and when dialog optimization is applied the number of bytes is calculated by (C06*C04)*((S01*(C09+C11+C13+C14+C15+C14))+(S02*C10)). This includes the multipart boundary of the resource list. Note that for dialog optimization it is assumed that only a single presentity is changed and partial state notification is used.

(S04) Number of SUBSCRIBE messages for refreshes per watcher per day = ((C01/C03)-1)*C05. One is subtracted since the termination is calculated separately. For example, if there are 8 hours in the day and a refresh should occur every hour, there are 7 refreshes during the day and not 8.

(S05) Number of 200 OK messages for SUBSCRIBE messages for refreshes per watcher per day = ((C01/C03)-1)*C05.

(S06) Number of NOTIFY messages for refreshes per watcher per day = ((C01/C03)-1)*C05. Since there is no need to send a NOTIFY for refreshes when NOTIFY optimization is used, S06 is zero when NOTIFY optimization is used.

(S07) Number of 200 OK messages for NOTIFY messages for refreshes per watcher per day = ((C01/C03)-1)*C05. Since there is no need to send a NOTIFY for refreshes when NOTIFY optimization is used, S07 is zero when NOTIFY optimization is used.

(S08) Total number and size of messages due to SUBSCRIBE refreshes per day = Number - (S04+S05+S06+S07)*C06. The number of bytes is calculated by adding the SUBSCRIBE bytes (S04*C06*C07), the OK for SUBSCRIBE bytes (S05*C06*C08), the NOTIFY bytes C06*(S06*(C09+C11)) and the OK for NOTIFY bytes (S07*C06*C10). Note that this formula for the NOTIFY bytes applies when dialog optimization is not used; when it is used the formula is C06*(S06*((C09+C14+C15)+(C04*(C11+C13+C14)))). Note that a full state should be given in SUBSCRIBE refreshes in resource lists. See section 5.2 in . The fact that the full state needs to be returned in the NOTIFY sent in response to a refresh makes the NOTIFY optimization more effective in conjunction with the dialog optimization.

(S09) Total number and bytes of steady state messages per day = Number - numbers in S03+S08, Bytes - sizes in S03+S08.
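The steady-state formulas S01 to S09 can be evaluated in the same way. The following is a minimal sketch; the function and parameter names are hypothetical, the refresh count assumes that C01 is a whole multiple of C03, and the example inputs (C04 = 4, C06 = 40,000) come from the first use case in this document.

```python
# Message sizes in bytes, taken from constants C07..C15 in the text.
C07, C08, C09, C10 = 450, 370, 500, 370
C13, C14, C15 = 160, 144, 144


def steady_state(c04, c06, c01=8, c02=3, c03=1, c11=350,
                 dialog_optimization=False, notify_optimization=False):
    """Return (messages, bytes) for S09, the steady state over all watchers."""
    c05 = 1 if dialog_optimization else c04

    # S01..S03: NOTIFY/200 pairs caused by presence state changes.
    s01 = s02 = c02 * c01 - 2
    s03_count = (s01 + s02) * c06 * c04
    if dialog_optimization:
        change_notify = C09 + c11 + C13 + C14 + C15 + C14
    else:
        change_notify = C09 + c11
    s03_bytes = (c06 * c04) * (s01 * change_notify + s02 * C10)

    # S04..S08: subscription refreshes (assumes c01 is a multiple of c03).
    s04 = s05 = (c01 // c03 - 1) * c05
    s06 = s07 = 0 if notify_optimization else s04
    s08_count = (s04 + s05 + s06 + s07) * c06
    if dialog_optimization:
        refresh_notify = (C09 + C14 + C15) + c04 * (c11 + C13 + C14)
    else:
        refresh_notify = C09 + c11
    s08_bytes = (s04 * c06 * C07 + s05 * c06 * C08
                 + c06 * s06 * refresh_notify + s07 * c06 * C10)

    return s03_count + s08_count, s03_bytes + s08_bytes


print(steady_state(4, 40_000))
print(steady_state(4, 40_000, notify_optimization=True))
```

With these inputs the state-change traffic (S03) dominates the refresh traffic (S08), which is consistent with the observation that dialog optimization alone does not improve the steady-state rates.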

The following are the calculations for the messages in the termination phase of the subscriptions. The calculations contain both the number of messages and the number of bytes.

(T01) Number of terminating SUBSCRIBE messages per watcher = C05.

(T02) Number of terminating 200 OK messages for SUBSCRIBE messages per watcher = C05.

(T03) Number of terminating NOTIFY messages per watcher = C05. Since there is no need to send a NOTIFY for terminations when NOTIFY optimization is used, T03 is zero when NOTIFY optimization is used.

(T04) Number of terminating 200 OK messages for NOTIFY messages per watcher = C05. Since there is no need to send a NOTIFY for terminations when NOTIFY optimization is used, T04 is zero when NOTIFY optimization is used.

(T05) Total number and bytes of terminating SUBSCRIBE messages for all watchers = Number - T01*C06, Bytes - T01*C06*C07.

(T06) Total number and bytes of terminating 200 OK for SUBSCRIBE messages for all watchers = Number - T01*C06, Bytes - T01*C06*C08.

(T07) Total number and bytes of terminating NOTIFY messages for all watchers = Number - T03*C06. The number of bytes is calculated to be (T03*C06*(C09+C11)) when dialog optimization is not used and (T03*C06*(C09+C14+C15))+(T03*C06*C04*(C11+C13+C14)) when dialog optimization is used. Note that a full state should be given in SUBSCRIBE refreshes in resource lists. See section 5.2 in .

(T08) Total number and bytes of terminating 200 OK for NOTIFY messages for all watchers = Number - T04*C06, Bytes - T04*C06*C10.

(T09) Total number and bytes of terminating messages per day = Number - numbers in T05+T06+T07+T08, Bytes - sizes in T05+T06+T07+T08.
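The termination formulas T01 to T09 can be sketched as follows. The function and parameter names are hypothetical. One interpretation is made explicit here: the count of terminating NOTIFY messages is taken as T03*C06, which equals T01*C06 unless NOTIFY optimization is used, so that both the count and the bytes drop to zero together in that case.

```python
# Message sizes in bytes, taken from constants C07..C15 in the text.
C07, C08, C09, C10 = 450, 370, 500, 370
C13, C14, C15 = 160, 144, 144


def terminating_phase(c04, c06, c11=350,
                      dialog_optimization=False, notify_optimization=False):
    """Return (messages, bytes) for T09, the termination phase over all watchers."""
    c05 = 1 if dialog_optimization else c04
    t01 = t02 = c05
    t03 = t04 = 0 if notify_optimization else c05  # no terminating NOTIFY if optimized

    t05 = (t01 * c06, t01 * c06 * C07)             # SUBSCRIBE with Expires = 0
    t06 = (t02 * c06, t02 * c06 * C08)             # 200 OK for SUBSCRIBE
    if dialog_optimization:
        t07_bytes = (t03 * c06 * (C09 + C14 + C15)
                     + t03 * c06 * c04 * (c11 + C13 + C14))
    else:
        t07_bytes = t03 * c06 * (C09 + c11)
    t07 = (t03 * c06, t07_bytes)                   # assumption: count uses T03
    t08 = (t04 * c06, t04 * c06 * C10)             # 200 OK for NOTIFY

    parts = (t05, t06, t07, t08)
    return sum(p[0] for p in parts), sum(p[1] for p in parts)


print(terminating_phase(4, 40_000))
print(terminating_phase(4, 40_000, notify_optimization=True))
```

Without optimizations the termination phase mirrors the initial phase, since both exchange one SUBSCRIBE/200 and one NOTIFY/200 per dialog.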

The following are the calculations of several totals that are based on the above calculations.

(B01) Total number of messages and bytes during the day = Messages - number of messages in I09+S09+T09, Bytes - number of bytes in I09+S09+T09.

(B02) Total number of messages and bytes per second = Messages - number of messages in B01/(C01*3600), Bytes - number of bytes in B01/(C01*3600).

(B03) Total number of messages and bytes per user per day = Messages - number of messages in B01/C06, Bytes - number of bytes in B01/C06.
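The totals can be derived from the three phase results. The following is a minimal sketch; the function name and dictionary keys are hypothetical, and the phase values in the example are our own evaluation of the formulas above for C04 = 4, C06 = 40,000, a 350-byte presence document and no optimizations, not figures copied from the tables of this document.

```python
def totals(i09, s09, t09, c01=8, c06=40_000):
    """Combine (messages, bytes) of the three phases into per-day, per-second
    and per-user totals."""
    messages = i09[0] + s09[0] + t09[0]
    size = i09[1] + s09[1] + t09[1]
    seconds = c01 * 3600          # the "day" is the C01-hour subscription lifetime
    return {
        "per_day": (messages, size),
        "per_second": (messages / seconds, size / seconds),
        "per_user": (messages / c06, size / c06),
    }


result = totals(
    i09=(640_000, 326_400_000),        # initial phase
    s09=(11_520_000, 6_579_200_000),   # steady state
    t09=(640_000, 326_400_000),        # termination phase
)
for name, (msgs, size) in result.items():
    print(f"{name}: {msgs:,.1f} messages, {size:,.1f} bytes")
```

Note that the per-second figure divides by the assumed subscription lifetime (C01 hours) rather than by 24 hours.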

With the way that the calculations are built, it is relatively easy to see the effect of rush hours at the beginning and the end of the day. For the beginning of the day we should look at the numbers of "(I09) Total number and bytes of initial messages per day" and for the end of the day we should look at the numbers of "(T09) Total number and bytes of terminating messages per day". Taking these numbers together with an assumed percentage of the users that log in during the same hour should give a good indication of the rush hour load.
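As an illustration of this estimate, the following minimal sketch spreads a fraction of the initial (or terminating) messages evenly over one hour. The function name, the even-spread assumption and the 50 percent fraction are hypothetical; the 640,000 figure is our own evaluation of I09 for C04 = 4 and C06 = 40,000 without optimizations.

```python
def rush_hour_rate(phase_messages_per_day, fraction_in_same_hour):
    """Messages per second if the given fraction of users starts (or ends)
    its subscriptions within the same hour, spread evenly over that hour."""
    return phase_messages_per_day * fraction_in_same_hour / 3600


# Hypothetical example: half of the users log in during the same hour.
print(round(rush_hour_rate(640_000, 0.5), 1))
```

A real rush hour is unlikely to be evenly spread, so the peak rate within the hour may be noticeably higher than this average.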

The following table uses some common presence characteristics to demonstrate the effect these factors have on state and message rate within a presence domain using base SIP protocols without any proposed optimizations. In this example, there are two presence domains with a total of 40,000 federating users, each with an average of 4 contacts in the peer domain. Note that the main calculation is done for a presence document size of 350 bytes, which is the base PIDF document size, but the bottom line calculation is also given for a rich presence document, which is assumed to be 3000 bytes based on the examples given in the RFCs. This two-fold calculation is done for every use case in this document.

The same analysis provided above is repeated here with the assumption that the dialog optimization is applied. Note that while the sign-in (ramp up) and sign-out message flows are positively affected, the steady state rates are not.

The initial analysis provided above is repeated here with the assumption that the NOTIFY optimization is applied. The optimization removes the need for a NOTIFY upon refreshing a SUBSCRIBE if there was no change since the last NOTIFY. It is assumed here that there will be no NOTIFY message for SUBSCRIBE refreshes and terminations. As should be expected, this optimization affects the steady state and the termination phase and does not affect the initial phase.

Here both optimizations are combined. In all the subsequent use cases we will show only the analysis with no optimizations and with both optimizations combined.

While scalability issues exist in any large deployment, some deployments have characteristics that make them conducive to the existing optimizations, while others do not. Following is a list of federation scenarios that have varying usage characteristics. For each, a message rate and bandwidth table is provided reflecting typical changes in message rates. Those characteristics can alter the overall effectiveness of existing optimizations. Note that the number of users used is not the number of users in the domains but the number of users that are actually logged in. As was mentioned before, not all the domain users will use the presence service at the same time. The numbers used for watchers and watched presentities are for online users.

In some environments presence federation may be very common, perhaps even more common than intra-domain presence. An example of this type of environment is a small ISV or public server. Users in that small ISV are not likely to subscribe to the presence of other users on their server, since they do not necessarily have any relationship with each other aside from receiving service from the same provider. They are much more likely to be subscribed to the presence of users in one of the federated domains (whether in consumer domains, academic, other ISVs, etc.).

Common characteristics of this deployment are:
- Federated subscriptions are the majority of subscription traffic
- Individual users are likely to subscribe to multiple users in any one domain
- The intersection of users in the deployment watching the same presentities is quite small (i.e., the probability that watchers in the domain subscribe to the same presentity is low)

To account for the extraordinarily high percentage of federation traffic, the number of federated presentities is increased to 20. The number of watchers in the domain could also be adjusted to account for the expected larger community of users being peered with; this adjustment is omitted here for simplicity.

The first table below provides the calculations without optimizations; the second table provides the calculations with optimizations.

In this type of environment, the domain is a collection of associated users such as an enterprise. Here, federation is once again very common. However, there is also a strong association between some users in the deployment. These associations make it somewhat more likely that users in that domain will be watchers of the same presentity. This can occur because of business relationships (e.g., two co-workers on a project federating with a partner company).

Common characteristics of this deployment are:
- Federated subscriptions are a large minority or a small majority of subscription traffic
- Individual users are likely to subscribe to multiple users in any one domain, especially their own
- The intersection of users in the deployment watching the same presentities increases

This federation type has traffic rates similar to the previous examples but with different levels of association between the users.

In this environment, two or more very large networks create a peering relationship allowing their users to subscribe to presence in the other domains. Whereas the number of users in other deployment types ranges from hundreds to several hundred thousand, these large networks host up to hundreds of millions of users. Examples of these networks are large wireless carriers and consumer IM networks.

Common characteristics of this deployment are:
- As users become accustomed to network boundaries disappearing, federated subscriptions become as common as subscriptions within the same domain
- Individual users are highly likely to want to see the presence of multiple presentities in the peer network
- The intersection of users in the deployment watching the same presentities is very high (i.e., two or more users in network A are extremely likely to be watching the same user in network B)
- Status changes increase greatly due to typical observed consumer behavior

The first table below provides the calculations without optimizations; the second table provides the calculations with optimizations. Even though the optimizations help a lot (they cut the number of messages almost by half), the numbers are still very high. Note also that the bandwidth required is very high.

Within a particular domain, multiple presence infrastructures are deployed with users split between them. This scenario is unique in that federated messages do not pass outside the administrative domain's network. The two infrastructures peer directly inside the domain. A common example of this is an enterprise IT system with multiple independent vendor presence solutions deployed (e.g., a presence solution for desktop messaging deployed alongside a presence solution for IP telephony).

Common characteristics of this deployment are:
- The difference between subscriptions to presentities in one system vs. the other is completely arbitrary. Any one presentity is as likely to be homed on one infrastructure as on the other.
- Active users are almost guaranteed to subscribe to many users in the peer infrastructure.
- The level of intersection of presentities is extremely high.

The first table below provides the calculations without optimizations; the second table provides the calculations with optimizations. Even though relatively conservative numbers are used, the number of messages is still very high, even though optimization may cut the traffic by more than half.

Draft defines a way for the watcher to request only what has changed in the presence document. The following is a calculation of the bandwidth that is saved in the very large peering network case when we add the partial notification optimization to the dialog and NOTIFY optimizations. It is assumed that, except for the initial NOTIFY, all the other NOTIFY messages will be partial. It is also assumed that only a single attribute in the presence document will change each time, thus the size of the partial presence document is assumed to be 200 bytes.

SIP is a network-agnostic protocol. Therefore, the protocol carries additional messages, such as 200 OK, that would be redundant in a protocol that is based on TCP only. The following calculation assumes an imaginary TCP-only version of SIP that optimizes the following:
- There is no 200 OK for each message. Since only TCP has to be supported, there is no need to compensate for network issues.
- There is no refresh for subscriptions.
- There is no NOTIFY upon termination of a subscription.
- The size of each message is smaller since there is no need for the various headers that SIP uses for routing etc. So we assume smaller message sizes while keeping the size of the presence document the same.

As noted above, the calculations in this document do not assume offline means of getting parts of the presence information. Therefore, in addition to the above optimizations, the other optimizations that were assumed in the document are assumed here also. These include partial notifications and the dialog optimization. The NOTIFY optimization is not relevant here since there are no refreshes of subscriptions.

The following is a calculation for the very large networks peering scenario assuming the imaginary TCP-only SIP. It is very interesting to note that, due to the RLMI overhead, the dialog optimization does not reduce the number of bytes when the partial notification optimization is applied (on the contrary).

Previous sections discussed the large number of messages that need to be sent to and from a presence server. This section analyzes the state that needs to be maintained by a presence server and shows it to be far from trivial.

The presence server has two parallel tasks:
- Maintain the state of the presentities to which watchers subscribe.
- Maintain the state of the subscriptions of watchers and provide timely updates to the watchers.

For a single subscription from a single watcher on a presentity, the presence server has to maintain the following state:
- Subscription state, including all the parameters that are needed in order to maintain the subscription, such as timers.
- Optional filtering information that was requested by the watcher. This includes enough information for doing the filtering. Additional information has to be maintained if partial notification is supported for the subscription.
- Optional rate management information, such as throttling.
- Watcher information that results from the subscription, which enables watched presentities to see who is watching them.

For each presentity that has been subscribed to in the presence server, the presence server has to maintain the following state:
- A list of the subscriptions for the presentity. Note that, from the size calculation point of view, this is already covered by the subscription state above.
- Authorization information for the presentity.

For each presentity for which there was any publication and which has a state other than a default value, the presence server has to maintain the current value of the presentity.

Let us assume the following sizes: Subscription size - 2K bytes. This includes the watcher information that needs to be created by the presence server for each subscription. It applies to each subscription made by each watcher to each presentity that the watcher is watching, so if we have 10K subscriptions we have 10K of these. Subscribed-to resource - 1K bytes (for privacy information and other management information). This applies to each presentity that is being watched, no matter how many watchers are watching it. The subscriptions themselves are already counted in the previous item. Resource with a state - 6K bytes. This is a moderate assumption if we take into account the amount of data that is put in a presence document, such as multiple devices, calendar and geographical information. This applies to each presentity that has a state other than the default empty state, whether or not it is being watched.

10K subscriptions = 19M bytes. 5K subscribed to presentities = 5M bytes. 10K presentities with state = 58M bytes. Total is 82M bytes.

100K subscriptions = 195M bytes. 50K subscribed to presentities = 49M bytes. 100K presentities with state = 586M bytes. Total is 830M bytes.

6M subscriptions = 11,718M bytes. 3M subscribed to presentities = 2,929M bytes. 4M presentities with state = 23,437M bytes. Total is 38G bytes.

150M subscriptions = 292,969M bytes. 75M subscribed to presentities = 73,242M bytes. 100M presentities with state = 585,937M bytes. Total is 952G bytes, which is a very large amount for storage as dynamic as the presence server needs.
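The arithmetic behind the four scenarios above can be reproduced with a short sketch. It assumes that K means 1024 bytes and M means 1024 * 1024 bytes, which matches the per-item figures quoted above; the function and constant names are ours and purely illustrative. The totals quoted in the text appear to be sums of already rounded terms, so a recomputed total can differ by about one unit (the first scenario comes out at roughly 83M rather than 82M).

```python
# Sketch of the state-size arithmetic, using the per-object sizes assumed in the text.
# Assumption: K = 1024 bytes and M = 1024 * 1024 bytes.
KIB = 1024
MIB = 1024 * 1024

SUBSCRIPTION_BYTES = 2 * KIB          # per subscription, including watcher information
SUBSCRIBED_RESOURCE_BYTES = 1 * KIB   # per watched presentity (privacy and management info)
RESOURCE_WITH_STATE_BYTES = 6 * KIB   # per presentity with a non-default state


def state_size_mib(subscriptions, subscribed_presentities, presentities_with_state):
    """Return the size of each part of the state, and the total, in M bytes."""
    parts = {
        "subscriptions": subscriptions * SUBSCRIPTION_BYTES / MIB,
        "subscribed_presentities": subscribed_presentities * SUBSCRIBED_RESOURCE_BYTES / MIB,
        "presentities_with_state": presentities_with_state * RESOURCE_WITH_STATE_BYTES / MIB,
    }
    parts["total"] = sum(parts.values())
    return parts


if __name__ == "__main__":
    scenarios = [
        (10_000, 5_000, 10_000),
        (100_000, 50_000, 100_000),
        (6_000_000, 3_000_000, 4_000_000),
        (150_000_000, 75_000_000, 100_000_000),
    ]
    for scenario in scenarios:
        sizes = state_size_mib(*scenario)
        print(scenario, {name: round(value, 1) for name, value in sizes.items()})
```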

Although the numbers above may seem moderate for the sizes that the presence server is handling, we should consider the following: Dynamic state - Although the state may not seem large by database standards, even for the very large system, this state is very dynamic. Subscriptions come and go all the time, the status of presentities is updated, and so forth. This means that the presence server has to manage its state in a medium that supports very frequent changes, and for such large sizes this task is not trivial. Interlinked state - The subscriptions and the subscribed-to presentities depend on each other. There needs to be a link from the presentity to its subscriptions and vice versa. See also the discussion of the interlinkage that is created by resource lists. Moderate assumptions - The size assumptions made above are quite moderate. As presence becomes more of a core middleware function that holds a lot of data about the user, the real-life numbers may be even higher, and the presence server can have additional overhead such as managing SIP sessions, networking and more. The calculations above do not show a real issue with state management of presence in medium or even big systems, since it should be possible to divide the state between different machines; nevertheless, the state size is still very big. A bigger issue arises when resource lists are involved and create an interlinked state between many servers. In that case dividing a very big state between multiple servers becomes less trivial...

The basic presence paradigm consists of a watcher and a presentity that the watcher watches. It sounds simple enough, but there are many additions and extensions that the presence server has to manage, and they make its processing very complex. In this section we show that, in addition to the large number of messages and the big state that the presence server has to handle, it also has to handle quite intensive processing for aggregation, partial notify and publish, filtering and privacy. This adds complexity on the CPU front, in addition to the network and memory fronts that were described before.

A presence document may contain multiple resources. These resources can be devices of the presentity, information that is received from external providers of presence information for the presentity (such as geographical and calendar information), and more. The presence server needs to be able to get the updates from all the resources and aggregate them correctly into a single presence document. Although this is just an "XML processing" task, the number of updates that the presence server may get, the need to keep the presence document aligned with its schema and the need to notify the users as soon as possible create a significant processing burden on the presence server.
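As a rough illustration of the aggregation step, the following sketch keeps the latest input from every source of a presentity and composes a single document on demand. It is a simplification under stated assumptions: a real server works on XML presence documents and validates them against the schema, whereas here each input is a plain dictionary, and the class and method names are hypothetical.

```python
# Illustrative only: aggregation of per-source presence inputs into one document.
# Assumption: each source (device, calendar feed, location feed) publishes a flat dict.
class PresenceAggregator:
    def __init__(self):
        # presentity -> {source_id: latest data published by that source}
        self._by_source = {}

    def publish(self, presentity, source_id, data):
        """Store the latest input of one source, replacing its previous input."""
        self._by_source.setdefault(presentity, {})[source_id] = dict(data)

    def document(self, presentity):
        """Compose a single document from all sources, in a stable order."""
        sources = self._by_source.get(presentity, {})
        return {
            "entity": presentity,
            "tuples": [{"id": sid, **sources[sid]} for sid in sorted(sources)],
        }
```

Every publication from any source changes the composed document, so each one can trigger a new round of notifications to all watchers of the presentity.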

The partial notify and partial publish drafts define a way for the watcher to request only what has changed in the presence document, and for the publisher of presence information to publish only what has changed in the presence document since the last publish. Although these optimizations help in reducing the amount of data that is sent to and from the presence server, they create an additional processing burden on the presence server. When a partial publish arrives at the presence server, the presence server has to process it and change only what is indicated in it, while keeping the presence document well formed and valid according to the schema. In partial notify the processing is even more complex, since each watcher needs to get the partial update relative to the last update that was received by that watcher. Therefore, partial notify specifies a versioning mechanism that enables the watcher to get the updates based on the previous state that it has seen. This versioning mechanism has to be maintained by the presence server for each watcher that is subscribed to a presentity and requires partial notify.
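The per-watcher bookkeeping that partial notify implies can be sketched as follows. This is not the mechanism defined by the drafts: we assume a flat dictionary instead of an XML document and a simple integer version per watcher, and all names are hypothetical. The point is only that the server keeps, for every watcher, the version and the state that the watcher has last seen.

```python
# Illustrative only: per-watcher versioning for partial notifications.
# Assumption: the presence document is a flat dict, not an XML document.
def diff(old, new):
    """Return (changed, removed): fields to set and fields to delete to go from old to new."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    removed = [k for k in old if k not in new]
    return changed, removed


class PartialNotifier:
    def __init__(self):
        # watcher -> (version last sent, snapshot of the document last sent)
        self._last = {}

    def notify(self, watcher, document):
        """Return the notification body for this watcher, or None if nothing changed."""
        if watcher not in self._last:
            # First notification on a subscription always carries the full state.
            self._last[watcher] = (1, dict(document))
            return {"type": "full", "version": 1, "state": dict(document)}
        version, snapshot = self._last[watcher]
        changed, removed = diff(snapshot, document)
        if not changed and not removed:
            return None
        version += 1
        self._last[watcher] = (version, dict(document))
        return {"type": "partial", "version": version, "changed": changed, "removed": removed}
```

Note that this naive sketch stores a full snapshot per watcher, which is exactly the kind of additional state and processing that the text refers to.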

Filtering, as defined in the filtering RFCs, enables a watcher to request to be notified only when the presence document fulfills certain conditions. Although this is a very convenient feature for watchers, the burden that it puts on the presence server is quite big. For each change in the presence document, the presence server needs to evaluate the filtering expressions, which can be very complex, and decide whether to notify each watcher that has requested filtering and what to send to it.
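The per-change work that filtering creates can be sketched as a loop over all watchers that installed a filter. The filter representation below (a trigger predicate plus a list of requested fields) is our own simplification for illustration and is not the XML filter format of the RFCs.

```python
# Illustrative only: evaluating every watcher's filter on each document change.
# Assumption: a filter is a dict with a "trigger" predicate and a list of "fields".
def notifications_for_change(document, filters):
    """Return {watcher: filtered body} for the watchers whose trigger matches."""
    out = {}
    for watcher, flt in filters.items():
        if flt["trigger"](document):  # whether to notify this watcher
            # what to send: only the fields that the watcher asked for
            out[watcher] = {k: document[k] for k in flt["fields"] if k in document}
    return out
```

The cost grows with the number of filtering watchers multiplied by the rate of changes, and each trigger may itself be an expensive expression.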

The presence authorization rules draft defines rules that presentities can use to define who can see what in their presence documents. The processing that the presence server has to do here is very similar to filtering. When there is a change to any presence document that has privacy rules defined for it, the presence server needs to create different notifications for different watchers according to what is defined in the authorization rules.
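One way to picture the cost of privacy processing is to group the watchers by the set of fields that they are allowed to see, so that each distinct view of the document is composed once. The sketch below assumes a much simpler rule model than the authorization rules draft (a set of permitted field names per watcher), and its names are hypothetical.

```python
# Illustrative only: composing per-watcher views of one presence document.
# Assumption: rules map a watcher to the set of field names it may see.
def build_views(document, rules, watchers, default_fields=frozenset()):
    """Return a list of (view, watchers) pairs, one pair per distinct permitted field set."""
    by_fields = {}
    for watcher in watchers:
        allowed = frozenset(rules.get(watcher, default_fields))
        by_fields.setdefault(allowed, []).append(watcher)
    return [
        ({k: v for k, v in document.items() if k in allowed}, group)
        for allowed, group in by_fields.items()
    ]
```

In the worst case every watcher has its own view, and a single change to the presence document requires as many different notification bodies as there are watchers.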

The resource list extension defines a way to subscribe to a single URI that actually represents a list of resources, so that all of them are subscribed to by a single subscription. Although this is a quite useful mechanism and it significantly reduces the number of sessions between the watcher and the presence server (as we show in the calculations of messages), this feature has the potential to make the scalability issue of presence systems harder and more complex. The reasons that resource lists may make the scalability problem of the presence server even more complex are: Subscriptions and state - The resource list may contain references to many other presence servers in many other domains. This requires the RLS to create subscriptions to other presence servers and to buffer the state of all presentities in order to be able to provide the full state of the presentities in the list when needed. So in the overall system, the subscriptions that were saved between the watcher and the presence server are moved to the backend system, while state has been duplicated between the various presence servers that serve the various presentities and the RLSs. This issue could have been mitigated if there were a way for the RLS to retrieve the presence information once for many watchers while adhering to privacy when sending the actual notifications to the watchers. Interlinkage - The resource list subscription reaches one RLS, which expands it and sends subscriptions to many presence servers and to other RLSs (if there is a subgroup inside the list). This way a complex linkage between the state of many components is created. This linkage makes state management and other maintenance of a presence system quite complex. Big lists are easy - There are two types of groups that may be used with this feature: private groups that are defined by or for each watcher, and public groups that are defined in the system and can be used by any watcher. 
Although we should expect IT administrators to take caution when creating public groups, this may not be the case in real life. The connection between the size of a public group and the load on the presence system may not be apparent to everyone. Furthermore, many public groups that are used in presence systems may have been created for other purposes, such as email systems (where the size of the lists was not so important), and are taken as they are into presence systems. So, for example, we may very easily find that a public group that actually covers all the users in the enterprise is used by many users in the enterprise, thus creating an unbearable load on the presence server. Note that this issue is not a protocol or design issue but more a usage issue that may have a real impact on the presence system. Stopping notifications - A watcher may accidentally subscribe to a very big list and be overwhelmed by the number of notifies that it receives from the presence server. There is currently no way to stop this stream of notifies, and even canceling the subscription may take time to become effective. The issues mentioned above are one example of an optimization that helps in one part of the system but creates even bigger problems in the overall system. There is a need to think about the problems listed above, but more than that there is a need to make sure that when an optimization is introduced it does not create issues in other places.
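The effect of large public groups on the back end can be illustrated with a small counting sketch. It compares the number of back-end subscriptions that an RLS creates when every list entry of every watcher becomes its own subscription with the number it would create if multiple occurrences of the same resource could share one subscription. The model and the function name are ours; in particular, the consolidated figure ignores the privacy constraints that currently prevent such sharing.

```python
# Illustrative only: back-end subscriptions created by an RLS for list subscriptions.
# Assumption: list_subscriptions is a list of (watcher, members) pairs.
def backend_subscriptions(list_subscriptions, consolidated=False):
    """Count subscriptions from the RLS to presence servers."""
    if consolidated:
        # Hypothetical consolidation: one subscription per distinct resource.
        return len({member for _, members in list_subscriptions for member in members})
    # One subscription per list entry of every watcher.
    return sum(len(members) for _, members in list_subscriptions)


if __name__ == "__main__":
    # A public group covering a 1000-user enterprise, used by every one of its users.
    users = ["user%d" % i for i in range(1000)]
    subs = [(user, users) for user in users]
    print(backend_subscriptions(subs))                     # one per watcher per member
    print(backend_subscriptions(subs, consolidated=True))  # one per distinct member
```

With N users all subscribing to a group of N members, the first count grows as N squared while the second grows as N, which is why a single convenient public group can dominate the load of the whole system.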

This section lists and discusses several optimizations that either are already part of the SIP protocol or have been suggested in various drafts. Several other optimizations that have been suggested but have not yet been discussed in any working group are summarized in separate documents. Note that trials with the batched notifies optimization showed an improvement of 117% in the overall throughput of presence traffic. Subnot-etags - This draft suggests ways to suppress the sending of unnecessary notifies, for example when a subscription is refreshed. This suggestion seems to be an efficient optimization, since it saves both on the number of messages sent and on the processing time of the presence server. Resource List Service - Enables creating a single subscription session between the watcher and the presence server for subscribing to a list of users. This reduces the number of sessions that are created between watchers and presence servers. On the other hand, this mechanism makes it possible for relatively few clients to create a very large number of subscriptions between presence servers and RLSs, especially if large public groups are used. It seems that, in order to really optimize in this area, the usage of large public groups should not be considered a BCP, and there should be a way for an RLS to create a single subscription for multiple occurrences of the same resource in resource lists. See consolidated subscriptions below. Partial notify/publish - The partial notify and partial publish drafts define a way for the subscriber to request only what has changed in the presence document, and for the publisher of presence information to publish only what has changed in the presence document since the last publish. 
Although these optimizations help in reducing the amount of actual data that is sent to and from the presence server, they create an additional processing burden on the presence server, as was discussed above. Filtering - Enables a watcher to request to be notified only when the presence document fulfills certain conditions. Although this optimization reduces the number of messages that are sent from the presence server to the watcher, it puts more burden on the processing time of the presence server, as was discussed above. Throttling - A mechanism in which a watcher asks to be updated only at certain intervals. Although this mechanism may add some load on the processing time of the presence server, that load is negligible and the reduction in the number of messages sent from the presence server to the watchers is significant. This optimization is even more important with resource lists, which can contain many resources; if the traffic of updates on a resource list is not regulated, the watcher may get a very large number of notifications. Presence-specific SigComp dictionary - Defines a SigComp dictionary for presence. This optimization reduces the number of bytes that are transferred in presence systems by compressing the textual SIP messages; with the specialized presence dictionary the compression may be more significant than with SigComp as is. Note that the number of actual messages remains the same, and a calculation of the number of bytes that would be saved may be useful here. Content Indirection - Enables sending only the URI of the presence document to the watcher, thus relieving the presence server from sending the presence document itself to the watcher. This optimization may be useful in some cases, especially where a large number of users get the same presence document.
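The throttling item above can be sketched as a small per-subscription rate limiter that sends at most one notification per interval and coalesces intermediate updates into the latest state. This is a minimal sketch under our own assumptions (time is passed in explicitly, and the class and method names are hypothetical); it does not follow the exact rules of the throttling draft.

```python
# Illustrative only: per-subscription throttling that coalesces updates.
# Assumption: the caller supplies the current time in seconds and calls flush() periodically.
class Throttle:
    def __init__(self, min_interval):
        self.min_interval = min_interval  # minimal time between two notifications
        self._last_sent = None            # time of the last notification, if any
        self._pending = None              # latest state held back, if any

    def update(self, state, now):
        """Offer a new state; return it if it may be sent now, else hold it and return None."""
        if self._last_sent is None or now - self._last_sent >= self.min_interval:
            self._last_sent = now
            self._pending = None
            return state
        self._pending = state  # overwrite: only the most recent state matters
        return None

    def flush(self, now):
        """Return the held-back state once the interval has passed, else None."""
        if self._pending is not None and now - self._last_sent >= self.min_interval:
            state, self._pending = self._pending, None
            self._last_sent = now
            return state
        return None
```

The extra work per update is a comparison and an assignment, which is consistent with the observation that the processing cost of throttling is negligible compared with the messages that it saves.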

Following is a summary of the various calculations. This is repeated here in order to ease the understanding of the conclusions that are listed below. The following table summarizes the various constants that are used in ALL calculations.

The following table summarizes the results of various optimization factors for the basic use case.

The following table summarizes the results of various optimization factors for the widely distributed inter domain use case.

The following table summarizes the results of various optimization factors for the intra-domain peering use case.

The following table summarizes the results of various optimization factors for the very large scale peering networks use case.

The following conclusions can be drawn from the above numbers: Due to the overhead of RLMI, the dialog optimization helps in reducing neither the number of bytes nor the number of messages. It seems to be more important for the convenience of the user, since it enables the user to manage his or her watch list on, for example, a web page. The notify optimization reduces both the number of messages and the number of bytes. Partial notification saves a lot in the number of bytes, especially when the presence document is a rich presence document, which is relatively big. Comparing with a very optimized SIP protocol (the imaginary TCP-only SIP) shows that the number of messages is lower by about a half. The number of bytes is also reduced by about a half. When looking at the numbers from the perspective of the number of bytes that a user "consumes" per day, the numbers may not look so big. Nevertheless, we should remember that the overall effect on the network may be quite big, since the network will have to carry dozens of gigabytes per day, for presence traffic only, for the modest use cases that are described in this document. Recalling that presence is only an enabler for other media, these numbers are not so easy to handle. The document analyzes the scalability of presence systems, and of SIP-based systems in particular. It is apparent that the scalability of these systems is far from trivial from several perspectives: number of messages, network bandwidth, state management and CPU load. As part of the analysis we have analyzed several optimizations and showed the effect of these optimizations on the number of messages and the number of bytes that are sent between the federating domains. We have also computed the number of messages and bytes for a very large scale peering network while assuming a protocol that has much less overhead than SIP. Even with that protocol we got relatively high numbers. 
It is very possible that the issues that are described in this document are inherent to presence systems in general and not specific to the SIMPLE protocol. Organizations need to be prepared to invest a lot in network and hardware in order to create really big systems. However, it is apparent that not all the possible optimizations have been done yet and further work is needed in the IETF in order to provide better scalability. Nevertheless, we should remember that SIP was originally designed for end-to-end session creation, and the number and size of messages are of secondary importance for end-to-end session negotiation. For large scale and especially for very large scale presence, the number of messages that are needed and the size of each message are of extreme importance. It seems that we need to think about the problem in a different way. We need to think about scalability as part of the protocol design. The IETF tends not to think about actual deployments when designing a protocol, but in this case it seems that if we do not think about scalability as part of the protocol design it will be very hard to scale. We should also consider whether using the same protocol between clients and servers and between servers is a good choice for this problem. It may be that in interdomain traffic, or even between servers in the same domain (such as between RLSs and presence servers), there is a need for a different protocol that is highly optimized for the load and can make some assumptions about the network (e.g. use only TCP and not an unreliable protocol such as UDP). When a server connects to another server using the current protocol, there will be an extreme number of redundant messages due to the overhead of supporting UDP and to the need to send multiple presence documents for the same watched user due to privacy issues. A server-to-server protocol will have to address these issues. 
Some initial work to address these issues can be found in other documents. Another issue, more related to protocol design, is whether NOTIFY messages should be considered media, in the same way that audio, video and even text messaging are. SUBSCRIBE could be extended to perform a three-way handshake similar to INVITE and negotiate where the notify messages should go, their rate and other parameters. This way the load could be moved to specialized NOTIFY "relays", without loading the control path of SIP. One possible idea (from Marc Willekens) is to use the SIP stack for the client/server NOTIFY but to use a more optimized and controllable protocol for the server-to-server interface. Another possibility is to use the MSRP protocol for the notifies.

This document discusses scalability issues with the existing SIP/SIMPLE presence protocol and model. Therefore, there are no security considerations to be considered for this document. However, a lot of the possible optimizations that should emerge as a result of this document will have security implications that will need to be solved.

This document has no actions for IANA.

Fixed mistakes in calculations that were found by Victoria Beltran-Martinez; both relate to dialog optimizations. One mistake was not including the multipart boundary of the resource list itself in S03 when dialog optimizations were used. The other was assuming in the T07 calculation that only a single presentity is returned on termination. Fixed nits that were reported to me by Robert Sparks.

Fixed a mistake in the formulas of I07 and S08 (RLMI was not included). The effect on the total number of bytes was very small. Fixed a mistake in the text of the calculation of the number of bytes for S08 for non-dialog optimization. No actual change in the number of bytes, since the Excel file calculations were done correctly. Removed general references throughout the text to "other protocols". This was done in order to avoid the impression that the document tries to compare the SIP protocol with any other presence protocol. Several other editorial and clarification changes.

Added some input from real-life deployments and input on a test with batched notifies. Added calculations of messages and bytes per user. Calculations are now done both for a minimal size of presence document and for an average size of rich presence document. Comparison with another protocol is now done using small, tiny and rich presence document sizes. Removed dialog optimization with partial notification since it is not relevant. Fixed a few issues in calculations that were found by Victoria Beltran-Martinez. Added overhead for RLMI for dialog optimizations (list subscription). This calculation fix actually shows that dialog optimization is not a real optimization from the point of view of bytes and number of messages. When NOTIFY optimizations are applied there is no need for a final NOTIFY. The usage of RLS between domains was clarified. Significantly enhanced the conclusions section. Several typo fixes.

Fixed a bug in the calculations. Thanks to Marc Willekens for finding the bug.

Clarifications and corrections of the computation model and the computations. Added several more computations to show the influence of different optimizations. The requirements were moved to a separate document. The new suggestions for optimizations were moved to separate documents.

We would like to thank Jonathan Rosenberg, Ben Campbell, Robert Sparks, Markus Isomaki, Piotr Boni, David Viamonte, Aki Niemi and Peter Saint-Andre for ideas and input. Special thanks to Marc Willekens and Victoria Beltran-Martinez for finding several issues in the calculations.