start documenting the protocols
parent ddb83e9d59
commit 9314c6918f
							
								
								
									
docs/introduction.md
@@ -0,0 +1,56 @@
# Magic-Wormhole

The magic-wormhole (Python) distribution provides several things: an
executable tool ("bin/wormhole"), an importable library (`import wormhole`),
the URL of a publicly available Rendezvous Server, and the definition of a
protocol used by all three.

The executable tool provides basic sending and receiving of files,
directories, and short text strings. These all use `wormhole send` and
`wormhole receive` (which can be abbreviated as `wormhole tx` and
`wormhole rx`). It also has a mode to facilitate the transfer of SSH keys.
This tool, while useful on its own, is just one possible use of the protocol.

The `wormhole` library provides an API to establish a bidirectional ordered
encrypted record pipe to another instance (where each record is an
arbitrary-sized bytestring). This does not provide file-transfer directly:
the "bin/wormhole" tool speaks a simple protocol through this record pipe to
negotiate and perform the file transfer.

`wormhole/cli/public_relay.py` contains the URLs of a Rendezvous Server and a
Transit Relay which I provide to support the file-transfer tools, which other
developers should feel free to use for their applications as well. I cannot
make any guarantees about performance or uptime for these servers: if you
want to use Magic Wormhole in a production environment, please consider
running a server on your own infrastructure (just run `wormhole-server start`
and modify the URLs in your application to point at it).
					
## The Magic-Wormhole Protocol

There are several layers to the protocol.

At the bottom level, each client opens a WebSocket to the Rendezvous Server,
sending JSON-based commands to the server, and receiving similarly-encoded
messages. Some of these commands are addressed to the server itself, while
others are instructions to queue a message to other clients, or are
indications of messages coming from other clients. All these messages are
described in "server-protocol.md".

These inter-client messages are used to convey the PAKE protocol exchange,
then a "VERSION" message (which doubles as verification of the session key),
then some number of encrypted application-level data messages.
"client-protocol.md" describes these wormhole-to-wormhole messages.

Each wormhole-using application is then free to interpret the data messages
as it pleases. The file-transfer app sends an "offer" from the
`wormhole send` side, to which the `wormhole receive` side sends a response,
after which the Transit connection is negotiated (if necessary), and finally
the data is sent through the Transit connection. "file-transfer-protocol.md"
describes this application's use of the client messages.
					
## The `wormhole` API

Applications use the `wormhole` library to establish wormhole connections and
exchange data through them. Please see `api.md` for a complete description of
this interface.
					
docs/server-protocol.md
@@ -0,0 +1,235 @@
# Rendezvous Server Protocol

## Concepts

The Rendezvous Server provides queued delivery of binary messages from one
client to a second, and vice versa. Each message contains a "phase" (a
string) and a body (bytestring). These messages are queued in a "Mailbox"
until the other side connects and retrieves them, but are delivered
immediately if both sides are connected to the server at the same time.

Mailboxes are identified by a large random string. "Nameplates", in contrast,
have short numeric identities: in a wormhole code like "4-purple-sausages",
the "4" is the nameplate.

Each client has a randomly-generated "side", a short hex string, used to
differentiate between echoes of a client's own message, and real messages
from the other client.
					
## Application IDs

The server isolates each application from the others. Each client provides an
"AppID" when it first connects (via the `bind` message), and all subsequent
commands are scoped to this application. This means that nameplates
(described below) and mailboxes can be re-used between different apps. The
AppID is a unicode string. Both sides of the wormhole must use the same
AppID, of course, or they'll never see each other. The server keeps track of
which applications are in use for maintenance purposes.

Each application should use a unique AppID. Developers are encouraged to use
"DNSNAME/APPNAME" to obtain a unique one: e.g. the `bin/wormhole`
file-transfer tool uses `lothar.com/wormhole/text-or-file-xfer`.
					
## WebSocket Transport

At the lowest level, each client establishes (and maintains) a WebSocket
connection to the Rendezvous Server. If the connection is lost (which could
happen because the server was rebooted for maintenance, or because the
client's network connection migrated from one network to another, or because
the resident network gremlins decided to mess with you today), clients should
reconnect after waiting a random (and exponentially-growing) delay. The
Python implementation waits about 1 second after the first connection loss,
growing by 50% each time, capped at 1 minute.
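
The reconnection schedule described above can be sketched as a small
generator. This is only an illustration, not the code the Python client
actually uses: the function name, its parameters, and the 10% jitter fraction
are assumptions made for the example (the text only says the delay is
"random").

```python
import random


def reconnect_delays(initial=1.0, growth=1.5, cap=60.0, jitter=0.1, rng=random):
    """Yield successive reconnect delays, in seconds.

    The base delay starts at `initial`, is multiplied by `growth` after each
    attempt, and never exceeds `cap`. Each yielded value is randomized by up
    to +/- `jitter` (a fraction of the base delay; an assumed value).
    """
    base = initial
    while True:
        spread = base * jitter
        yield base + rng.uniform(-spread, spread)
        base = min(base * growth, cap)
```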
					
Each message to the server is a dictionary, with at least a `type` key, and
other keys that depend upon the particular message type. Messages from server
to client follow the same format.

`misc/dump-timing.py` is a debug tool which renders timing data gathered from
the server and both clients, to identify protocol slowdowns and guide
optimization efforts. To support this, the client/server messages include
additional keys. Client->Server messages include a random `id` key, which is
copied into the `ack` that is immediately sent back to the client for all
commands (and is ignored except for the timing tool). Some client->server
messages (`list`, `allocate`, `claim`, `release`, `close`, `ping`) provoke a
direct response by the server: for these, `id` is copied into the response.
This helps the tool correlate the command and response. All server->client
messages have a `server_tx` timestamp (seconds since epoch, as a float),
which records when the message left the server. Direct responses include a
`server_rx` timestamp, to record when the client's command was received. The
tool combines these with local timestamps (recorded by the client and not
shared with the server) to build a full picture of network delays and
round-trip times.

All messages are serialized as JSON, encoded to UTF-8, and the resulting
bytes sent as a single "binary-mode" WebSocket payload.
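
As a rough illustration of the wire encoding just described, the following
sketch turns a message dictionary into the bytes of one binary WebSocket
payload and back. The helper names `encode_message` and `decode_message` are
made up for this example; actually sending the bytes is left to whatever
WebSocket library is in use.

```python
import json


def encode_message(msg):
    """Serialize a message dict into the bytes of one binary payload."""
    if "type" not in msg:
        raise ValueError("every message needs a 'type' key")
    return json.dumps(msg).encode("utf-8")


def decode_message(payload):
    """Parse a received binary payload back into a message dict."""
    msg = json.loads(payload.decode("utf-8"))
    if not isinstance(msg, dict) or "type" not in msg:
        raise ValueError("payload is not a message dictionary")
    return msg
```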
					
Servers can signal `error` for any message type they do not recognize.
Clients and Servers must ignore unrecognized keys in otherwise-recognized
messages.
					
## Connection-Specific (Client-to-Server) Messages

The first thing each client sends to the server, immediately after the
WebSocket connection is established, is a `bind` message. This specifies the
AppID and side (in keys `appid` and `side`, respectively) that all subsequent
messages will be scoped to. While technically each message could be
independent, I thought it would be less confusing to use exactly one
WebSocket per logical wormhole connection.
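
A `bind` message could be built along these lines. This is a sketch under
assumptions: the helper names are hypothetical, and the 5-byte (10 hex digit)
side length is chosen only for illustration, since this document says no more
than that the side is "a short hex string".

```python
import secrets


def make_side(nbytes=5):
    """Return a random hex string to use as this client's side.

    The length is an assumption made for this example.
    """
    return secrets.token_hex(nbytes)


def make_bind(appid, side):
    """Build the first message a client sends after connecting."""
    return {"type": "bind", "appid": appid, "side": side}
```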
					
The first thing the server sends to each client is the `welcome` message.
This is intended to deliver important status information to the client that
might influence its operation. The Python client currently reacts to the
following keys (and ignores all others):

* `current_cli_version`: prompts the user to upgrade if the server's
  advertised version is greater than the client's version (as derived from
  the git tag)
* `motd`: prints this message, if present; intended to inform users about
  performance problems, scheduled downtime, or to beg for donations to keep
  the server running
* `error`: causes the client to print the message and then terminate. If a
  future version of the protocol requires a rate-limiting CAPTCHA ticket or
  other authorization record, the server can send `error` (explaining the
  requirement) if it does not see this ticket arrive before the `bind`.
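
The reactions listed above might look roughly like the following. This is a
sketch, not the real client code: `handle_welcome`, `parse_version`, and
`WelcomeError` are hypothetical names, the upgrade wording is invented, and
the version comparison assumes plain dotted integers such as `0.8.1` (versions
derived from a git tag may carry suffixes that this does not handle).

```python
class WelcomeError(Exception):
    """Raised when the server's welcome carries an `error` key."""


def parse_version(text):
    """Turn '0.8.1' into (0, 8, 1); assumes dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def handle_welcome(welcome, my_version):
    """Return lines to show the user, or raise WelcomeError to terminate.

    `welcome` is the dictionary of welcome keys; unknown keys are ignored.
    """
    if "error" in welcome:
        raise WelcomeError(welcome["error"])
    lines = []
    if "motd" in welcome:
        lines.append(welcome["motd"])
    advertised = welcome.get("current_cli_version")
    if advertised and parse_version(advertised) > parse_version(my_version):
        lines.append("A newer version (%s) is available; please upgrade." % advertised)
    return lines
```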
					
A `ping` will provoke a `pong`: these are only used by unit tests for
synchronization purposes (to detect when a batch of messages has been fully
processed by the server). NAT-binding refresh messages are handled by the
WebSocket layer (by asking Autobahn to send a keepalive message every 60
seconds), and do not use `ping`.

If any client->server command is invalid (e.g. it lacks a necessary key, or
was sent in the wrong order), an `error` response will be sent. This response
will include the error string in the `error` key, and a full copy of the
original message dictionary in `orig`.
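
To make the shape of the `error` response concrete, here is a minimal sketch
of a missing-key check. The required keys are taken from the message list in
this document; the function names and the wording of the error strings are
assumptions, and the real server performs further checks (such as ordering)
that are not shown.

```python
# Keys each command must carry, per the message list in this document.
REQUIRED_KEYS = {
    "bind": ("appid", "side"),
    "claim": ("nameplate",),
    "open": ("mailbox",),
    "add": ("phase", "body"),
}


def make_error(reason, original):
    """Build an `error` response that echoes the offending message."""
    return {"type": "error", "error": reason, "orig": dict(original)}


def check_command(msg):
    """Return None if `msg` has its required keys, else an error response."""
    for key in REQUIRED_KEYS.get(msg.get("type"), ()):
        if key not in msg:
            # The error text is illustrative, not the server's real wording.
            return make_error("%s requires '%s'" % (msg["type"], key), msg)
    return None
```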
					
## Nameplates

Wormhole codes look like `4-purple-sausages`, consisting of a number followed
by some random words. This number is called a "Nameplate".

On the Rendezvous Server, the Nameplate contains a pointer to a Mailbox.
Clients can "claim" a nameplate, and then later "release" it. Each claim is
for a specific side (so one client claiming the same nameplate multiple times
only counts as one claim). A nameplate is deleted once the last client has
released it, or after some period of inactivity.
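
The claim and release rules can be modeled with a set of sides per nameplate.
The class below is an in-memory sketch with hypothetical names: it is not how
the server stores nameplates, it omits the inactivity timeout, and it expects
the caller to supply the mailbox identifier to use when a nameplate is first
claimed.

```python
class NameplateTable:
    """Toy model: each nameplate points at a mailbox and tracks its claimants."""

    def __init__(self):
        self._mailbox = {}  # nameplate -> mailbox id
        self._sides = {}    # nameplate -> set of sides holding a claim

    def claim(self, nameplate, side, new_mailbox_id):
        """Record a claim and return the mailbox the nameplate points at."""
        if nameplate not in self._mailbox:
            self._mailbox[nameplate] = new_mailbox_id
            self._sides[nameplate] = set()
        # A set makes repeated claims from one side count only once.
        self._sides[nameplate].add(side)
        return self._mailbox[nameplate]

    def release(self, nameplate, side):
        """Drop one side's claim; delete the nameplate when none remain."""
        sides = self._sides.get(nameplate)
        if sides is None:
            return
        sides.discard(side)
        if not sides:
            del self._sides[nameplate]
            del self._mailbox[nameplate]

    def exists(self, nameplate):
        return nameplate in self._mailbox
```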
					
Clients can either make up nameplates themselves, or (more commonly) ask the
server to allocate one for them. Allocating a nameplate automatically claims
it (to avoid a race condition), but for simplicity, clients send a claim for
all nameplates, even ones which they've allocated themselves.

Nameplates (on the server) must live until the second client has learned
about the associated mailbox, after which point they can be reused by other
clients. So if two clients connect quickly, but then maintain a long-lived
wormhole connection, they do not need to consume the limited space of short
nameplates for that whole time.

The `allocate` command allocates a nameplate (the server returns one that is
as short as possible), and the `allocated` response provides the answer.
Clients can also send a `list` command to get back a `nameplates` response
with all allocated nameplates for the bound AppID: this helps the code-input
tab-completion feature know which prefixes to offer. The `nameplates`
response returns a list of dictionaries, one per claimed nameplate, with at
least an `id` key in each one (with the nameplate string). Future versions
may record additional attributes in the nameplate records.
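
One way to read "as short as possible" is to try every one-digit number
before any two-digit number, and so on. The sketch below does that, picking
randomly among the shortest free candidates; the function names, the random
choice, and starting at 1 are all assumptions rather than the server's actual
algorithm. The second helper shows the shape of a `nameplates` response.

```python
import random


def allocate_nameplate(in_use, rng=random):
    """Return an unused numeric nameplate string with as few digits as possible."""
    digits = 1
    while True:
        low = 1 if digits == 1 else 10 ** (digits - 1)
        free = [str(n) for n in range(low, 10 ** digits) if str(n) not in in_use]
        if free:
            return rng.choice(free)
        digits += 1  # every nameplate of this length is taken


def nameplates_response(in_use):
    """Build a `nameplates` response listing each claimed nameplate."""
    return {
        "type": "nameplates",
        "nameplates": [{"id": n} for n in sorted(in_use, key=int)],
    }
```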
					
## Mailboxes

The server provides a single "Mailbox" to each pair of connecting Wormhole
clients. This holds an unordered set of messages, delivered immediately to
connected clients, and queued for delivery to clients which connect later.
Messages from both clients are merged together: clients use the included
`side` identifier to distinguish echoes of their own messages from those
coming from the other client.
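
The queue-and-deliver behaviour can be pictured with a toy in-memory mailbox.
This is a sketch with hypothetical names, not the server's implementation: it
keeps messages in a Python list purely for convenience (the protocol makes no
ordering promise), and each subscriber is just a callable.

```python
class Mailbox:
    """Toy model of one mailbox shared by the two sides of a wormhole."""

    def __init__(self):
        self.messages = []   # everything added so far
        self.listeners = []  # delivery callables for connected clients

    def open(self, deliver):
        """Subscribe a client: replay queued messages, then deliver live ones."""
        for msg in self.messages:
            deliver(msg)
        self.listeners.append(deliver)

    def add(self, side, phase, body_hex):
        """Store a message and push it to every subscriber, sender included."""
        msg = {"type": "message", "side": side, "phase": phase, "body": body_hex}
        self.messages.append(msg)
        for deliver in self.listeners:
            deliver(msg)
```

Because the sender is also a subscriber, it sees its own message come back,
which is why each client compares `side` against its own value before acting
on a message.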
					
Each mailbox is "opened" by some number of clients at a time, until all
clients have closed it. Mailboxes are kept alive by either an open client, or
a Nameplate which points to the mailbox (so when a Nameplate is deleted from
inactivity, the corresponding Mailbox will be too).

The `open` command both marks the mailbox as being opened by the bound side,
and also adds the WebSocket as subscribed to that mailbox, so new messages
are delivered immediately to the connected client. There is no explicit ack
to the `open` command, but since all clients add a message to the mailbox as
soon as they connect, there will always be a `message` response shortly after
the `open` goes through. The `close` command provokes a `closed` response.

The `close` command accepts an optional "mood" string: this allows clients to
tell the server (in general terms) about their experiences with the wormhole
interaction. The server records the mood in its "usage" record, so the server
operator can get a sense of how many connections are succeeding and failing.
The moods currently recognized by the Rendezvous Server are:

* happy (default): the PAKE key-establishment worked, and the client saw a
  valid encrypted message from its peer
* lonely: the client gave up without hearing anything from its peer
* scary: the client saw an invalid encrypted message from its peer,
  indicating that either the wormhole code was typed in wrong, or an attacker
  tried (and failed) to guess the code
* errory: the client encountered some other error: protocol problem or
  internal error

The server will also record "pruney" if it deleted the mailbox due to
inactivity, or "crowded" if more than two sides tried to access the mailbox.
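
A client might choose its mood and build the `close` command roughly as
follows. The function names and the three boolean flags are hypothetical;
only the four mood strings and the `mailbox` and `mood` keys come from this
document.

```python
CLIENT_MOODS = ("happy", "lonely", "scary", "errory")


def pick_mood(heard_from_peer, decrypted_ok, other_error=False):
    """Map what the client observed onto one of the client-side moods."""
    if other_error:
        return "errory"
    if not heard_from_peer:
        return "lonely"
    if not decrypted_ok:
        return "scary"
    return "happy"


def make_close(mailbox=None, mood=None):
    """Build a `close` command; both keys are optional."""
    if mood is not None and mood not in CLIENT_MOODS:
        raise ValueError("unknown mood: %r" % (mood,))
    msg = {"type": "close"}
    if mailbox is not None:
        msg["mailbox"] = mailbox
    if mood is not None:
        msg["mood"] = mood
    return msg
```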
					
When clients use the `add` command to add a client-to-client message, they
will put the body (a bytestring) into the command as a hex-encoded string in
the `body` key. They will also put the message's "phase", as a string, into
the `phase` key. See client-protocol.md for details about how different
phases are used.
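
The hex encoding of the body is easy to get wrong in one direction or the
other, so here is a small sketch of both. The helper names are made up for
this example, and the phase value used in any call is only a placeholder (the
real phase names are defined in client-protocol.md).

```python
def make_add(phase, body):
    """Build an `add` command from a phase string and a bytes body."""
    return {"type": "add", "phase": phase, "body": body.hex()}


def body_of(message):
    """Recover the binary body from an `add` or `message` dictionary."""
    return bytes.fromhex(message["body"])
```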
					
When a client sends `open`, it will get back a `message` response for every
message in the mailbox. It will also get a real-time `message` for every
`add` performed by clients later. These `message` responses include "side"
and "phase" from the sending client, and "body" (as a hex string, encoding
the binary message body). The decoded "body" will either be a random-looking
cryptographic value (for the PAKE message), or a random-looking encrypted
blob (for the VERSION message, as well as all application-provided payloads).
The `message` response will also include `id`, copied from the `id` of the
`add` message (and used only by the timing-diagram tool).

The Rendezvous Server does not de-duplicate messages, nor does it retain
ordering: clients must do both if they need to.
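
A client-side filter for echoes and duplicates could be as small as the
sketch below. It rests on an assumption not stated in this document: that
each side sends at most one message per phase, so `(side, phase)` identifies
a message. Restoring order is left out, since it depends on the phase naming
scheme described in client-protocol.md. The class name is hypothetical.

```python
class Inbox:
    """Accept each peer message once; ignore echoes of our own messages."""

    def __init__(self, my_side):
        self.my_side = my_side
        self._seen = set()

    def accept(self, message):
        """Return True only for a not-yet-seen message from the other side."""
        if message["side"] == self.my_side:
            return False  # echo of something we sent
        key = (message["side"], message["phase"])
        if key in self._seen:
            return False  # duplicate, e.g. replayed after a reconnect
        self._seen.add(key)
        return True
```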
					
## All Message Types

This lists all message types, along with the type-specific keys for each (if
any), and which ones provoke direct responses:

* S->C welcome {welcome:}
* (C->S) bind {appid:, side:}
* (C->S) list {} -> nameplates
* S->C nameplates {nameplates: [{id: str},..]}
* (C->S) allocate {} -> allocated
* S->C allocated {nameplate:}
* (C->S) claim {nameplate:} -> claimed
* S->C claimed {mailbox:}
* (C->S) release {nameplate:?} -> released
* S->C released
* (C->S) open {mailbox:}
* (C->S) add {phase: str, body: hex} -> message (to all connected clients)
* S->C message {side:, phase:, body:, id:}
* (C->S) close {mailbox:?, mood:?} -> closed
* S->C closed
* S->C ack
* (C->S) ping {ping: int} -> pong
* S->C pong {pong: int}
* S->C error {error: str, orig:}
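
The command-to-response pairs in this list can be written down as a lookup
table, which also gives a client a simple way to match a direct response to
the command that caused it via the copied `id`. The table contents follow
this document; `expected_response`, `match_response`, and the `pending`
dictionary are hypothetical names for the sketch.

```python
# Commands that provoke a direct response, and the response type for each.
DIRECT_RESPONSE = {
    "list": "nameplates",
    "allocate": "allocated",
    "claim": "claimed",
    "release": "released",
    "close": "closed",
    "ping": "pong",
}


def expected_response(command_type):
    """Return the direct response type for a command, or None if it has none."""
    return DIRECT_RESPONSE.get(command_type)


def match_response(pending, response):
    """Pop and return the pending command this response answers, else None.

    `pending` maps each sent command's `id` to the command dictionary.
    An `ack` carries the same `id` but is not a direct response, so it
    leaves the command pending.
    """
    command = pending.get(response.get("id"))
    if command is None:
        return None
    if expected_response(command["type"]) != response["type"]:
        return None
    return pending.pop(response["id"])
```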
					
## Persistence

The server stores all messages in a database, so it should not lose any
information when it is restarted. The server will not send a direct
response until any side-effects (such as the message being added to the
mailbox) have been safely committed to the database.

The client library knows how to resume the protocol after a reconnection
event, assuming the client process itself continues to run.

Clients which terminate entirely between messages (e.g. a secure chat
application, which requires multiple wormhole messages to exchange
address-book entries, and which must function even if the two apps are never
both running at the same time) can use "Journal Mode" to ensure forward
progress is made: see "api.md" (Journal Mode) for details.