Tackling File Transfers with the XMPPFramework

One of the projects I'm currently working on relies heavily on XMPP (Extensible Messaging and Presence Protocol) for a part of its core functionality, and more specifically, using XMPP to transfer files.

The most common library used to handle XMPP on iOS is the XMPPFramework, built primarily by Robbie Hanson. However, the XMPPFramework lacks support for file transfers.

After spending hours searching for solutions online, I decided to fork the repo and build the extension myself. I know there are a lot of people out there looking for this, so here you go—I've built a turnkey solution.


I'm going to spend a good deal of this post explaining the file transfer process, so for those of you who just want the solution, take a look at my demo app which gives an example of how to use the XMPPFramework extension in your own project.

You can also check out my fork of the repo for the source. I'll update this if/when it gets merged back into the official repo.

Protocols Overview

Let's face it, the file transfer process can be quite a pain. The only true documentation on it comes from the xmpp.org website itself, and at times I found it much less explicit than I would have liked.

I don't intend to dive too deeply into how it works, but I'll provide an overview here and let you look at the code for a more in-depth look (I made the documentation very straightforward).

In order to perform the transfer, we're required to pull different pieces from XEP-0096, XEP-0065, and XEP-0047.

XEP-0096: SI File Transfer

This protocol is used for initiating the stream (SI = Stream Initiation) and sending the details of the incoming or outgoing file. While this is a crucial and required step in the process, XEP-0096 doesn't do much more than that.

According to XEP-0096:

In order to enable seamless file transfer and appropriate fall-back mechanisms, implementations of this profile MUST support both SOCKS5 Bytestreams (XEP-0065) and In-Band Bytestreams (XEP-0047), to be preferred in that order.
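The preference order in that quote can be sketched as a small selection function. This is only an illustration in Python, not code from the XMPPFramework extension; the function name and the `offered` parameter are my own assumptions.

```python
# Stream-method namespaces, listed in the order XEP-0096 says to prefer them.
PREFERRED_METHODS = [
    "http://jabber.org/protocol/bytestreams",  # SOCKS5 Bytestreams (XEP-0065)
    "http://jabber.org/protocol/ibb",          # In-Band Bytestreams (XEP-0047)
]


def choose_stream_method(offered):
    """Return the most preferred method found in `offered`, or None.

    `offered` is any collection of stream-method namespace strings
    (a hypothetical input shape chosen for this sketch).
    """
    for method in PREFERRED_METHODS:
        if method in offered:
            return method
    return None
```

If both methods are on offer, SOCKS5 wins regardless of the order they were listed in; IBB is only chosen when SOCKS5 is missing.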

XEP-0065: SOCKS5 Bytestreams

After sending or receiving the details of the file, we use XEP-0065 (if available) to actually transfer the data. Socket Secure (SOCKS5) is a protocol that allows packets to be sent from one device to another either via direct connection or through a proxy server.

A direct connection is preferable, since it has obvious speed advantages and doesn't place any strain on your server. However, if either of the parties involved in the transfer is behind a Network Address Translation (NAT) device, a direct connection won't be possible. I'll let you read up on the history of how stupid IPv4 is, and why NAT exists/wreaks havoc on establishing TCP connections.

Long story short, a direct connection will probably be out of the question any time one or more of the parties reaches the internet from inside a LAN, which most of us do (e.g. my laptop, or my phone when it's not on cellular data). In that case a proxy is required.
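As a rough illustration of the point above, a client can guess whether it sits behind a NAT by checking whether its local address is in a non-public range. This Python sketch uses the standard `ipaddress` module; the function name is my own, and the check is only a heuristic, not something the XMPP specs require.

```python
import ipaddress


def likely_behind_nat(local_ip):
    """Heuristic: True when `local_ip` is not a publicly routable address.

    `is_private` covers the RFC 1918 LAN ranges as well as loopback,
    link-local, and a few other special-purpose ranges.
    """
    return ipaddress.ip_address(local_ip).is_private
```

A typical home or office address such as `192.168.1.20` is flagged, which is exactly the situation where the direct streamhost tends to fail and the proxy gets used.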

Once a connection is established, the file being transferred is written to and read from the bytestream connecting the two parties.

XEP-0047: In-Band Bytestreams

In-Band Bytestreams (IBB) can be considered the fallback plan to SOCKS5. Whereas SOCKS5 creates a connection between the sender and the receiver, IBB simply uses the existing messaging stream the server has with both parties. IBB sends blocks of base64-encoded data inside <iq/> stanzas.

Although this is generally a safe way to transfer data because almost every XMPP system will have support for IBB, this safety comes with some heavy limitations. The data must be broken into blocks and sent piecemeal. Proper implementations will send one block of data and wait for a response from the recipient before sending the next block. This ensures the data is received properly, but it also slows down the transfer significantly. Additionally, base64 encoding adds roughly 33% to the size of all data sent, and that doesn't include the overhead of the <iq/> stanza itself.
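To make the base64 cost concrete, here is a small Python sketch that splits a payload into blocks and encodes each one the way IBB would carry it. The 4096-byte block size and the helper name are assumptions for the example; the real block size is negotiated when the IBB session is opened.

```python
import base64


def ibb_blocks(data, block_size=4096):
    """Split `data` into blocks and base64-encode each block separately."""
    return [
        base64.b64encode(data[i:i + block_size]).decode("ascii")
        for i in range(0, len(data), block_size)
    ]


payload = bytes(30000)                        # 30,000 bytes of dummy data
blocks = ibb_blocks(payload)
encoded_size = sum(len(block) for block in blocks)
overhead = encoded_size / len(payload) - 1    # fraction added by base64 alone
```

Each of those blocks would still need to be wrapped in its own stanza and acknowledged before the next one goes out, so the real cost is higher than the base64 figure alone.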

IBB works consistently, but it should always be used as an alternative only if SOCKS5 isn't available.

The Transfer Process

To avoid redundancy, I'm only going to describe the outgoing file transfer process. Handling an incoming file transfer follows the same steps with the roles reversed. Note that each step is described under the presumption that everything went smoothly.

Client 1 (Sender)

JabberID: deckardcain@sanctuary.org/tristram
IP Address:
Filename: Baal's Soulstone.jpg

Client 2 (Receiver)

JabberID: tyrael@sanctuary.org/talrashastomb
IP Address:

1. Discover Recipient Capabilities

The very first thing we need to do is send a disco#info request to the recipient in order to ensure that they support file transfers and SOCKS5/IBB.

We will send something like this:

<iq xmlns="jabber:client" type="get" to="tyrael@sanctuary.org/talrashastomb" id="…">
  <query xmlns="http://jabber.org/protocol/disco#info"/>
</iq>

2. Client 2 Sends Supported Features

The recipient will respond with a list of the features supported on their device:

<iq type="result" from="tyrael@sanctuary.org/talrashastomb" to="deckardcain@sanctuary.org/tristram" id="…">
  <query xmlns="http://jabber.org/protocol/disco#info">
    <identity category="client" type="phone"/>
    <feature var="http://jabber.org/protocol/si"/>
    <feature var="http://jabber.org/protocol/si/profile/file-transfer"/>
    <feature var="http://jabber.org/protocol/bytestreams"/>
  </query>
</iq>
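Checking that response can be sketched in a few lines of Python with the standard library's ElementTree. The function name and the sample stanza are illustrative assumptions, not the extension's actual code.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "{http://jabber.org/protocol/disco#info}"

# Features the recipient must advertise before we offer a file.
REQUIRED = {
    "http://jabber.org/protocol/si",
    "http://jabber.org/protocol/si/profile/file-transfer",
}
# At least one of these transports must be available as well.
STREAM_METHODS = {
    "http://jabber.org/protocol/bytestreams",
    "http://jabber.org/protocol/ibb",
}


def supports_file_transfer(iq_xml):
    """Return True if a disco#info result advertises SI file transfer."""
    iq = ET.fromstring(iq_xml)
    features = {f.get("var") for f in iq.iter(DISCO_INFO_NS + "feature")}
    return REQUIRED <= features and bool(STREAM_METHODS & features)


SAMPLE_RESULT = """<iq type="result">
  <query xmlns="http://jabber.org/protocol/disco#info">
    <identity category="client" type="phone"/>
    <feature var="http://jabber.org/protocol/si"/>
    <feature var="http://jabber.org/protocol/si/profile/file-transfer"/>
    <feature var="http://jabber.org/protocol/bytestreams"/>
  </query>
</iq>"""
```

Calling `supports_file_transfer(SAMPLE_RESULT)` returns `True`; dropping the file-transfer profile feature from the sample makes it return `False`.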

3. Client 1 Sends a Stream Initiation Offer

Once the sender has verified that the recipient supports file transfers, the sender will create a Stream Initiation offer:

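<iq xmlns="jabber:client" type="set" to="tyrael@sanctuary.org/talrashastomb" id="…">
 <si xmlns="http://jabber.org/protocol/si" id="…" profile="http://jabber.org/protocol/si/profile/file-transfer">
   <file xmlns="http://jabber.org/protocol/si/profile/file-transfer"
         name="Baal's Soulstone.jpg" size="…">
     <desc>We should destroy this, right?</desc>
   </file>
   <feature xmlns="http://jabber.org/protocol/feature-neg">
     <x xmlns="jabber:x:data" type="form">
       <field var="stream-method" type="list-single">
         <option><value>http://jabber.org/protocol/bytestreams</value></option>
         <option><value>http://jabber.org/protocol/ibb</value></option>
       </field>
     </x>
   </feature>
 </si>
</iq>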

There are a couple of important things to note here. First, the <si/> tag contains an id. We need to store this value as it will be used later. Second, client 1 sends the features it supports (bytestreams and ibb) in the order of preference.
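Here is a rough Python sketch of building such an offer and holding on to the <si/> id. The function name, its parameters, and the use of random UUIDs for the ids are assumptions made for illustration; the extension itself is written in Objective-C and may generate ids differently.

```python
import uuid
import xml.etree.ElementTree as ET

SI_NS = "http://jabber.org/protocol/si"
FT_NS = "http://jabber.org/protocol/si/profile/file-transfer"
NEG_NS = "http://jabber.org/protocol/feature-neg"
DATA_NS = "jabber:x:data"


def build_si_offer(to_jid, filename, size, methods):
    """Build an SI offer and return (sid, iq_element).

    `methods` lists stream-method namespaces in order of preference.
    The caller keeps `sid` because the bytestream negotiation reuses it.
    """
    sid = uuid.uuid4().hex
    iq = ET.Element("iq", {"type": "set", "to": to_jid, "id": uuid.uuid4().hex})
    si = ET.SubElement(iq, "{%s}si" % SI_NS, {"id": sid, "profile": FT_NS})
    ET.SubElement(si, "{%s}file" % FT_NS, {"name": filename, "size": str(size)})
    feature = ET.SubElement(si, "{%s}feature" % NEG_NS)
    form = ET.SubElement(feature, "{%s}x" % DATA_NS, {"type": "form"})
    field = ET.SubElement(
        form, "{%s}field" % DATA_NS, {"var": "stream-method", "type": "list-single"}
    )
    for method in methods:
        option = ET.SubElement(field, "{%s}option" % DATA_NS)
        ET.SubElement(option, "{%s}value" % DATA_NS).text = method
    return sid, iq
```

Returning the `sid` alongside the stanza keeps the "store this value" step explicit: whoever sends the offer also gets the id needed later.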

4. Client 2 Responds to the SI Offer

Depending on the stream-method the recipient prefers, the response may differ slightly. If they have accepted with SOCKS5, expect something similar to this:

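<iq type="result" from="tyrael@sanctuary.org/talrashastomb" to="deckardcain@sanctuary.org/tristram" id="…">
 <si xmlns="http://jabber.org/protocol/si">
   <feature xmlns="http://jabber.org/protocol/feature-neg">
     <x xmlns="jabber:x:data" type="submit">
       <field var="stream-method"><value>http://jabber.org/protocol/bytestreams</value></field>
     </x>
   </feature>
 </si>
</iq>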
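Reading the chosen method out of that response might look like the following Python sketch. The function name and the sample stanza are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

DATA_NS = "{jabber:x:data}"


def accepted_stream_method(iq_xml):
    """Return the stream-method namespace the recipient picked, or None."""
    iq = ET.fromstring(iq_xml)
    for field in iq.iter(DATA_NS + "field"):
        if field.get("var") == "stream-method":
            return field.findtext(DATA_NS + "value")
    return None


SAMPLE_ACCEPT = """<iq type="result">
  <si xmlns="http://jabber.org/protocol/si">
    <feature xmlns="http://jabber.org/protocol/feature-neg">
      <x xmlns="jabber:x:data" type="submit">
        <field var="stream-method">
          <value>http://jabber.org/protocol/bytestreams</value>
        </field>
      </x>
    </feature>
  </si>
</iq>"""
```

A result of `http://jabber.org/protocol/bytestreams` means the SOCKS5 steps that follow apply; `http://jabber.org/protocol/ibb` means the transfer falls back to In-Band Bytestreams.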

5. Client 1 Sends a List of Streamhosts

If the recipient agrees to the file transfer and wishes to use SOCKS5, it is the sender's responsibility to collect a list of streamhosts that can be used for the transfer. In order to do this, the sender needs to discover its own IP address and also find out proxy server information.

Server information is obtained by a disco#items query followed up with a disco#info query to each service returned by the server.
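A minimal Python sketch of the two parsing steps involved: list the service JIDs from the disco#items result, then check each service's disco#info result for the proxy identity that XEP-0065 defines (category "proxy", type "bytestreams"). The function names and sample stanzas are assumptions, and `conference.sanctuary.org` is a made-up second service.

```python
import xml.etree.ElementTree as ET

ITEMS_NS = "{http://jabber.org/protocol/disco#items}"
INFO_NS = "{http://jabber.org/protocol/disco#info}"


def item_jids(items_xml):
    """Return the JIDs of the services listed in a disco#items result."""
    iq = ET.fromstring(items_xml)
    return [item.get("jid") for item in iq.iter(ITEMS_NS + "item")]


def is_bytestream_proxy(info_xml):
    """True if a disco#info result identifies a SOCKS5 bytestreams proxy."""
    iq = ET.fromstring(info_xml)
    return any(
        ident.get("category") == "proxy" and ident.get("type") == "bytestreams"
        for ident in iq.iter(INFO_NS + "identity")
    )


SAMPLE_ITEMS = """<iq type="result">
  <query xmlns="http://jabber.org/protocol/disco#items">
    <item jid="conference.sanctuary.org"/>
    <item jid="proxy.sanctuary.org"/>
  </query>
</iq>"""

SAMPLE_PROXY_INFO = """<iq type="result" from="proxy.sanctuary.org">
  <query xmlns="http://jabber.org/protocol/disco#info">
    <identity category="proxy" type="bytestreams"/>
    <feature var="http://jabber.org/protocol/bytestreams"/>
  </query>
</iq>"""
```

In practice the sender would send one disco#info query per JID returned by `item_jids` and keep only the services for which `is_bytestream_proxy` is true.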

Something like this will then be sent to the file recipient:

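<iq from='deckardcain@sanctuary.org/tristram' to='tyrael@sanctuary.org/talrashastomb' type='set' id='…'>
  <query xmlns='http://jabber.org/protocol/bytestreams' sid='…'>
    <streamhost jid='deckardcain@sanctuary.org/tristram' host='…' port='…'/>
    <streamhost jid='proxy.sanctuary.org' host='…' port='…'/>
  </query>
</iq>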

6. Client 2 Attempts a Connection

Upon receiving the list of streamhosts, client 2 will attempt to connect to each one in order until either a connection is established or there are no more streamhosts.
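The "try each one in order" logic can be sketched like this in Python. The function name, the tuple layout, and the injectable `connect` parameter are assumptions chosen to keep the example testable; this only opens the TCP connection and leaves out the SOCKS5 handshake that follows.

```python
import socket


def connect_to_first(streamhosts, timeout=5.0, connect=socket.create_connection):
    """Try each (jid, host, port) in order; return (jid, socket) for the first success.

    Returns (None, None) if every streamhost fails.
    """
    for jid, host, port in streamhosts:
        try:
            sock = connect((host, port), timeout)
        except OSError:
            # Unreachable, refused, or timed out: move on to the next streamhost.
            continue
        return jid, sock
    return None, None
```

Because the sender lists its own address first and the proxy second, this ordering naturally prefers a direct connection and only falls back to the proxy when the direct attempt fails.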

The actual process of connecting and authenticating is laid out in RFC 1928, and my code should explain it quite well, so I won't get into that here.
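For a feel of what goes over the wire, here is a Python sketch of the two messages the connecting client sends. Per XEP-0065, the SOCKS5 destination address is the hex-encoded SHA-1 hash of the stream id, the requester's JID, and the target's JID concatenated, with the port set to 0. The function names are my own, and this is a sketch of the byte layout rather than a full handshake (it doesn't read or validate the server's replies).

```python
import hashlib


def socks5_greeting():
    """Version 5, one auth method offered: 0x00 (no authentication)."""
    return b"\x05\x01\x00"


def socks5_connect_request(sid, requester_jid, target_jid):
    """Build the SOCKS5 CONNECT request used for XMPP bytestreams.

    The address is SHA-1(sid + requester JID + target JID) as 40 hex
    characters, sent with address type 0x03 (domain name) and port 0.
    """
    digest = hashlib.sha1(
        (sid + requester_jid + target_jid).encode("utf-8")
    ).hexdigest()
    address = digest.encode("ascii")
    return (
        b"\x05\x01\x00\x03"       # VER=5, CMD=CONNECT, RSV=0, ATYP=domain name
        + bytes([len(address)])   # address length (always 40)
        + address
        + b"\x00\x00"             # port 0
    )
```

The requester is the party that initiated the bytestream (the file sender here) and the target is the recipient, so both sides can compute the same hash from the stored stream id.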

7. Client 2 Sends Acknowledgement of the Streamhost

Once the recipient has established a connection with one of the provided streamhosts, an <iq/> stanza containing that particular streamhost needs to be sent back to client 1. This indicates that the transfer can now begin.

<iq xmlns="jabber:client" type="result" to="deckardcain@sanctuary.org/tristram" id="…">
 <query xmlns="http://jabber.org/protocol/bytestreams">
   <streamhost-used jid="deckardcain@sanctuary.org/tristram"/>
 </query>
</iq>

If the jid provided is the full JabberID of the sender, we know that this is a direct connection. Otherwise, the jid will be something like proxy.sanctuary.org.
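That check is simple enough to sketch in Python. The function names and sample stanza below are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

USED_PATH = ".//{http://jabber.org/protocol/bytestreams}streamhost-used"


def streamhost_used(iq_xml):
    """Return the jid from <streamhost-used/>, or None if it is absent."""
    used = ET.fromstring(iq_xml).find(USED_PATH)
    return used.get("jid") if used is not None else None


def is_direct_connection(used_jid, sender_jid):
    """A direct connection is one where the chosen streamhost is the sender."""
    return used_jid == sender_jid


SAMPLE_USED = """<iq xmlns="jabber:client" type="result">
  <query xmlns="http://jabber.org/protocol/bytestreams">
    <streamhost-used jid="deckardcain@sanctuary.org/tristram"/>
  </query>
</iq>"""
```

With the sample above, the returned jid matches the sender's full JabberID, so the transfer is direct; a value like `proxy.sanctuary.org` would mean the data flows through the proxy.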

8. Client 1 Writes the Data to the Bytestream

When the sender receives the <streamhost-used/> stanza above, it should begin writing the data to the bytestream that has been negotiated.

In many cases (mine included), the socket will disconnect after all the data has been written.

9. Client 2 Reads the Data from the Bytestream

This is pretty obvious—the recipient reads the data that is being sent. That data is then handled accordingly. More than likely, the socket will disconnect here.
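Steps 8 and 9 boil down to "write everything, then close" on one side and "read until the other side closes" on the other. Here is a minimal Python sketch using plain sockets; the function names are my own, and it ignores everything XMPP-specific (including checking the byte count against the size advertised in the SI offer).

```python
import socket


def send_file_bytes(sock, data):
    """Write all of `data` to the bytestream, then close the socket."""
    sock.sendall(data)
    sock.close()


def receive_file_bytes(sock, bufsize=65536):
    """Read from the bytestream until the sender closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:          # empty read: the other side disconnected
            break
        chunks.append(chunk)
    sock.close()
    return b"".join(chunks)
```

The empty read is the disconnect mentioned above: once the sender has written everything and closed its end, the recipient knows the data is complete.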

10. You're Finished!

At last...

That's enough for one post. I didn't cover how the IBB transfer process works, but it's a bit more straightforward than SOCKS5. Again, these classes walk step-by-step through everything, so take a look at them if you need more info.

My next post will show you how to use the File Transfer extension. As always, if you have questions, hit me up on Twitter @jonathonstaff.