Protocol
========

Jami account creation
---------------------

A **Jami account** is defined by an **RSA key pair** with a key length
of at least 4096 bits.

The standard x509 160-bit fingerprint of the account public key is
called the **RingID**.
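
As a minimal sketch of how such an identifier could be derived, the snippet below assumes that the fingerprint is the SHA-1 digest of the DER-encoded public key. The 160-bit size matches SHA-1, but the exact input encoding is an assumption, and `ring_id_from_der` is a hypothetical helper name, not part of any Jami API.

```python
import hashlib


def ring_id_from_der(public_key_der: bytes) -> str:
    """Return a 160-bit fingerprint as 40 hexadecimal characters.

    Assumption: the fingerprint is SHA-1 over the DER-encoded public key.
    """
    return hashlib.sha1(public_key_der).hexdigest()


# Placeholder bytes stand in for a real DER-encoded RSA public key.
example_id = ring_id_from_der(b"placeholder for DER-encoded key")
print(len(example_id) * 4, "bits")
```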

The account public key is used as the subject of an x509 certificate
that must be valid, have the Certificate Authority flag set, and can be
self-signed. This certificate is called the **Jami account
certificate**.

The subject UID field of the account certificate must be the hexadecimal
form of the public key fingerprint. The issuer UID field must be the
hexadecimal form of the issuer public key fingerprint.

### Persisting the account

Persisting a Jami account private key and certificate is implementation
defined.

Access to a saved Jami account private key must be authenticated and
authorized. The authentication and authorization method used to access
the account private key is implementation defined.

### Adding a device to a Jami account

*See [RFC 5280](https://tools.ietf.org/html/rfc5280)*

A **device** is defined by an RSA key pair with a key length of at least
4096 bits.

A **device certificate** is defined as an x509 certificate whose subject
is a device public key, signed with an account private key. The
certificate MUST be valid. The issuer UID field MUST be the hexadecimal
form of the account public key fingerprint.

Persisting a device private key and certificate is implementation
defined. Access to a saved device private key should be authenticated.
The authentication method used to access the device private key is
implementation defined.

### Removing a device from a Jami account

A device can be "removed" from a Jami account through revocation of the
device certificate. Revoked device certificates are added to one or more
standard x509 Certificate Revocation Lists (CRLs). CRLs for revoked
devices must be valid and signed with the corresponding CA key, which is
the Jami account private key.

### Account transmission format

The **account archive format** defines how to serialize an account
private key for transmission, for instance to sign a new device
certificate.

The account archive is an encrypted JSON object with the following
structure:

```
{
    "ringAccountKey": (PEM-encoded account private key string),
    "ringAccountCert": (PEM-encoded account certificate string),
    "ringAccountCRL": (PEM-encoded account CRL string)
}
```

The JSON object can contain additional implementation-defined key-value
pairs. Implementation-defined key names shouldn't start with "ring".
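
The plaintext archive described above could be assembled as in the following sketch. The function name `build_account_archive` and the `extra` parameter are hypothetical. Rejecting extra keys that start with "ring" is stricter than the "shouldn't" wording above and is a choice made only for this example. Encryption is not shown.

```python
import json
from typing import Dict, Optional


def build_account_archive(key_pem: str, cert_pem: str, crl_pem: str,
                          extra: Optional[Dict[str, str]] = None) -> str:
    """Serialize the account archive fields to a JSON string (unencrypted)."""
    archive = {
        "ringAccountKey": key_pem,
        "ringAccountCert": cert_pem,
        "ringAccountCRL": crl_pem,
    }
    for name, value in (extra or {}).items():
        # The "ring" prefix is reserved for keys defined by the format.
        if name.startswith("ring"):
            raise ValueError("extra key must not start with 'ring': " + name)
        archive[name] = value
    return json.dumps(archive)


# Placeholder strings stand in for real PEM data.
plaintext = build_account_archive("<key PEM>", "<cert PEM>", "<CRL PEM>",
                                  extra={"displayName": "example"})
```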

The JSON object, serialized as a string, is encrypted using a key defined as:

```
salt = PIN + timestamp
key = argon2(password, salt)
```

Where PIN is a random 32-bit number in hexadecimal form, "+" is string
concatenation, timestamp is the current UNIX timestamp divided by 1200
(20 minutes), and password is a user-chosen password.

The PIN should be shown to the user to be copied manually on the new
physical device along with the password.
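
The salt construction can be sketched as follows. Several details are assumptions made for illustration: the PIN is rendered as 8 zero-padded lowercase hexadecimal digits, the division is an integer division, and the timestamp is appended in decimal form. The Argon2 step is omitted because it requires a third-party library; `make_pin` and `make_salt` are hypothetical names.

```python
import secrets
import time

TIMESTAMP_STEP = 1200  # seconds, i.e. 20 minutes


def make_pin() -> str:
    """Random 32-bit number in hexadecimal (assumed zero-padded to 8 digits)."""
    return format(secrets.randbits(32), "08x")


def make_salt(pin: str, unix_time: int) -> str:
    """salt = PIN + timestamp, with the timestamp counted in 20-minute steps."""
    return pin + str(unix_time // TIMESTAMP_STEP)


pin = make_pin()
salt = make_salt(pin, int(time.time()))
# key = argon2(password, salt) would follow here.
```

Because the timestamp is part of the salt, the importing device derives the same key only if it computes the same timestamp value as the exporting device.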

Contacting another account
--------------------------

### ICE descriptor exchange over OpenDHT

-   **Listening for incoming calls**

A device listens for incoming calls by performing an OpenDHT listen
operation on

`h("callto"+deviceID)`

where h is SHA1, "+" is the string concatenation and deviceID is the
hexadecimal form of the deviceID.
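
A minimal sketch of this key computation, assuming the concatenated string is encoded as ASCII before hashing (the encoding is not stated above) and using the hypothetical helper name `call_listen_key`:

```python
import hashlib


def call_listen_key(device_id_hex: str) -> str:
    """Return h("callto" + deviceID) as a hexadecimal string, with h = SHA1."""
    data = ("callto" + device_id_hex).encode("ascii")
    return hashlib.sha1(data).hexdigest()


# Placeholder device ID: 40 hexadecimal characters (160 bits).
listen_key = call_listen_key("ab" * 20)
```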

Received OpenDHT values that are not encrypted or not properly signed
must be dropped. The value must be encrypted with the called device
public key and signed with the calling device private key according to
OpenDHT specifications.

-   **Sending the Initial Offer**

*See [RFC 5245](https://tools.ietf.org/html/rfc5245)*

RFC 5245 defines ICE (Interactive Connectivity Establishment), a
protocol for NAT traversal.

ICE is used in Jami to establish a peer-to-peer communication between
two devices.

The calling device gathers candidates, builds an Initial Offer
according to the ICE specifications, and starts the ICE negotiation
process.

The calling device puts the encrypted ICE offer (the Initial Offer) on
the DHT at h("callto"+deviceID) where deviceID is the hexadecimal form
of the called deviceID.

-   **ICE serialization format**

ICE messages exchanged between peers during call setup use the following
format. An ICE message is a chunk of binary data following the
[msgpack](http://msgpack.org/) data format.

The message is a sequence of msgpack values, packed successively in
this order:


+  an integer giving the version of the ICE message format used for the rest of the data. The currently defined protocol version is **1**.
+  a 2-element array of strings containing the ICE local session ufrag and the ICE local session password
+  an integer giving the number of components in the ICE session
+  an array of strings, with as many entries as the previous number, where each string describes an ICE candidate, formatted as an "a=" line (without the "a=" header) as described in [RFC 5245, section 4.3](https://tools.ietf.org/html/rfc5245#page-26)
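
The layout above can be illustrated with a small hand-written encoder that covers only the msgpack types needed here (unsigned integers, strings, and arrays of strings). This is a sketch rather than a reference implementation: the function names are hypothetical, the list is read literally (the candidate array has as many entries as the preceding integer), and a real client would normally use a msgpack library.

```python
import struct
from typing import List


def pack_uint(n: int) -> bytes:
    """Encode an unsigned integer (positive fixint, uint8, uint16 or uint32)."""
    if n < 0 or n > 0xFFFFFFFF:
        raise ValueError("integer out of range for this sketch")
    if n <= 0x7F:
        return bytes([n])
    if n <= 0xFF:
        return b"\xcc" + bytes([n])
    if n <= 0xFFFF:
        return b"\xcd" + struct.pack(">H", n)
    return b"\xce" + struct.pack(">I", n)


def pack_str(s: str) -> bytes:
    """Encode a string (fixstr, str8 or str16)."""
    data = s.encode("utf-8")
    n = len(data)
    if n <= 31:
        return bytes([0xA0 | n]) + data
    if n <= 0xFF:
        return b"\xd9" + bytes([n]) + data
    if n <= 0xFFFF:
        return b"\xda" + struct.pack(">H", n) + data
    raise ValueError("string too long for this sketch")


def pack_str_array(items: List[str]) -> bytes:
    """Encode an array of strings (fixarray or array16)."""
    n = len(items)
    if n <= 15:
        head = bytes([0x90 | n])
    elif n <= 0xFFFF:
        head = b"\xdc" + struct.pack(">H", n)
    else:
        raise ValueError("array too long for this sketch")
    return head + b"".join(pack_str(s) for s in items)


def pack_ice_message(ufrag: str, pwd: str, candidates: List[str],
                     version: int = 1) -> bytes:
    """Concatenate the four msgpack values in the order listed above."""
    return (pack_uint(version)
            + pack_str_array([ufrag, pwd])
            + pack_uint(len(candidates))
            + pack_str_array(candidates))


# Placeholder values, not a real ICE session.
message = pack_ice_message("u", "p", ["c"])
```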

-   **Sending the Answer**

Upon reception of the encrypted and signed Initial ICE Offer (through
the listen operation), a called device should perform authorization
checks of the calling device, identified as the Initial Offer signer.
Authorization rules are implementation defined, but a typical
implementation would authorize known or trusted contacts.

If the calling device is not authorized or if for any implementation
defined reason the called device refuses the incoming connection
request, the called device must ignore the Initial Offer and may log the
event.

If the called device authorizes the caller and wishes to accept the
connection, it must build an ICE answer, start the ICE negotiation
process, and send the encrypted and signed ICE answer at the same DHT
key.

### DTLS negotiation

Once a peer-to-peer communication channel has been established, the
called device listens on it for incoming DTLS connections (acting as a
DTLS server) while the caller initiates an outgoing DTLS connection
(acting as a DTLS client).

The DTLS communication must be compliant with
[RFC 6347](https://tools.ietf.org/html/rfc6347).

Peers must only support PFS cipher suites. The set of supported cipher
suites is implementation defined but should include at least
ECDHE-AES-GCM (TODO: specify the exact suites recommended to support).

During the DTLS handshake, both peers must provide their respective
device certificate chain and must authenticate the other peer, checking
that its public key is the same used during the DHT ICE exchange.

### SIP call

*See [Important\_RFC](Important_RFC "wikilink")*

Once an encrypted and authenticated peer-to-peer communication channel
is available, the SIP protocol
([RFC 3261](https://tools.ietf.org/html/rfc3261)) must be used to place
a call and send messages. The caller may send a SIP INVITE as soon as
the DTLS channel is established.

The SIP implementation must support ICE and SRTP.

Supported codecs are implementation defined, but Jami clients should
support the Opus audio codec and the H264 video codec.

SRTP must be used when negotiating media with SIP, using a new random
key for each media and each negotiation. ICE should be used when
negotiating media with SIP.

Cryptographic primitives
------------------------

### Password stretching

*See [Argon2
specifications](https://github.com/P-H-C/phc-winner-argon2/blob/master/argon2-specs.pdf)*

Passwords are stretched with argon2i using t\_cost = 16, m\_cost =
2\^16 (64 MiB), single-threaded, to generate a 512-bit hash.

The result is then hashed again using SHA{1, 256, 512} depending on the
requested key size.
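
The second hashing step might look like the sketch below. The mapping from key size to hash function and the truncation of the digest to the requested length are assumptions made for illustration, since only the set of hash functions is stated above. The argon2i step itself is not shown because it requires a third-party library, and `finalize_key` is a hypothetical name.

```python
import hashlib


def finalize_key(stretched: bytes, key_bits: int) -> bytes:
    """Hash the argon2i output again and cut it to the requested key size.

    Assumed mapping: up to 128 bits uses SHA1, up to 256 bits uses SHA256,
    up to 512 bits uses SHA512.
    """
    key_len = key_bits // 8
    if key_len <= 16:
        digest = hashlib.sha1
    elif key_len <= 32:
        digest = hashlib.sha256
    elif key_len <= 64:
        digest = hashlib.sha512
    else:
        raise ValueError("unsupported key size")
    return digest(stretched).digest()[:key_len]


# 64 placeholder bytes stand in for the 512-bit argon2i output.
key = finalize_key(bytes(64), 256)
```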

### Encryption

##### Using a provided key (128, 192 or 256 bits)

Encryption uses standard AES-GCM as implemented by Nettle using a random
IV for each encryption.

##### Using a text password

The password is stretched with a random 128-bit salt to generate a
256-bit key.

The input data is encrypted using AES-GCM (see above) and the salt is
prepended to the resulting ciphertext.
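
The resulting framing (salt first, then ciphertext) can be sketched as follows. Only the byte layout is shown; the AES-GCM step is left out because it needs a cryptographic library, and the helper names are hypothetical.

```python
from typing import Tuple

SALT_LEN = 16  # 128 bits


def join_salt(salt: bytes, ciphertext: bytes) -> bytes:
    """Place the salt in front of the AES-GCM output."""
    if len(salt) != SALT_LEN:
        raise ValueError("salt must be 16 bytes")
    return salt + ciphertext


def split_salt(blob: bytes) -> Tuple[bytes, bytes]:
    """Recover (salt, ciphertext) from an encrypted blob before decryption."""
    if len(blob) < SALT_LEN:
        raise ValueError("blob is shorter than the salt")
    return blob[:SALT_LEN], blob[SALT_LEN:]


# Placeholder bytes, not real ciphertext.
blob = join_salt(bytes(range(16)), b"ciphertext bytes")
salt, ciphertext = split_salt(blob)
```

The receiver needs the salt before it can stretch the password, which is why it travels in the clear at the front of the blob.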

##### During a call

Audio/video data are exchanged using encrypted RTP channels between
peers.

The protocol is standard SRTP, with the following supported crypto suites:

-   Jami accounts force AES\_CM\_128\_HMAC\_SHA1\_80
-   SIP can use AES\_CM\_128\_HMAC\_SHA1\_80 or
    AES\_CM\_128\_HMAC\_SHA1\_32

The master key and salt are random numbers, different for each call. A
call's master key remains constant for the full lifetime of the call.

The keys are exchanged using the SDES method: keys are written into the
SIP SDP messages during the SIP INVITE negotiation. When SDES is used,
Ring forces the underlying transport to be secure (encrypted) so that
these keys are not disclosed. Jami natively supports DTLS for both SIP
and Ring accounts for this purpose. The call cannot be placed if this
condition is not fulfilled.
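
For illustration, an SDES key is carried in an SDP `a=crypto` attribute as defined by RFC 4568. The sketch below builds such a line with a fresh random 128-bit master key and 112-bit master salt, which are the sizes used by the AES\_CM\_128\_HMAC\_SHA1 suites. The function name is hypothetical and this is not Jami's actual SDP code.

```python
import base64
import secrets


def sdes_crypto_line(tag: int = 1,
                     suite: str = "AES_CM_128_HMAC_SHA1_80") -> str:
    """Build an SDP crypto attribute with a new random master key and salt."""
    master_key = secrets.token_bytes(16)   # 128-bit master key
    master_salt = secrets.token_bytes(14)  # 112-bit master salt
    inline = base64.b64encode(master_key + master_salt).decode("ascii")
    return "a=crypto:{} {} inline:{}".format(tag, suite, inline)


line = sdes_crypto_line()
```

Since the key material appears in the SDP body, anyone able to read the signaling could decrypt the media, which is why the signaling transport itself has to be encrypted.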