# Magic-Wormhole

This library provides a mechanism to securely transfer small amounts
of data between two computers. Both machines must be connected to the
internet, but they do not need to have public IP addresses or know how to
contact each other ahead of time.

Security and connectivity are provided by means of a "wormhole code": a short
string that is transcribed from one machine to the other by the users at the
keyboard. This works in conjunction with a baked-in "rendezvous server" that
relays information from one machine to the other.

The "Wormhole" object provides a secure record pipe between any two programs
that use the same wormhole code (and are configured with the same application
ID and rendezvous server). Each side can send multiple messages to the other,
but the encrypted data for all messages must pass through (and be temporarily
stored on) the rendezvous server, which is a shared resource. For this
reason, larger data (including bulk file transfers) should use the Transit
class instead. The Wormhole object has a method to create a Transit object
for this purpose. In the future, Transit will be deprecated, and this
functionality will be incorporated directly as a "dilated wormhole".

A quick example:

```python
import wormhole
from twisted.internet import reactor
from twisted.internet.defer import inlineCallbacks

@inlineCallbacks
def go():
    # appid and relay_url are unicode strings supplied by the application
    w = wormhole.create(appid, relay_url, reactor)
    w.generate_code()
    code = yield w.when_code()
    print("code: %s" % code)
    w.send(b"outbound data")
    inbound = yield w.when_received()
    yield w.close()
```

## Modes

The API comes in two flavors: Delegated and Deferred. Controlling the
Wormhole and sending data is identical in both, but they differ in how
inbound data and events are delivered to the application.

In Delegated mode, the Wormhole is given a "delegate" object, on which
certain methods will be called when information is available (e.g. when the
code is established, or when data messages are received). In Deferred mode,
the Wormhole object has methods which return Deferreds that will fire at
these same times.

Delegated mode:

```python
class MyDelegate:
    def wormhole_got_code(self, code):
        print("code: %s" % code)
    def wormhole_received(self, data): # called for each message
        print("got data, %d bytes" % len(data))

w = wormhole.create(appid, relay_url, reactor, delegate=MyDelegate())
w.generate_code()
```

Deferred mode:

```python
w = wormhole.create(appid, relay_url, reactor)
w.generate_code()
def print_code(code):
    print("code: %s" % code)
w.when_code().addCallback(print_code)
def received(data):
    print("got data, %d bytes" % len(data))
w.when_received().addCallback(received) # gets exactly one message
```

## Application Identifier

Applications using this library must provide an "application identifier", a
simple string that distinguishes one application from another. To ensure
uniqueness, use a domain name. To use multiple apps for a single domain,
append a URL-like slash and path, like `example.com/app1`. This string must
be the same on both clients; otherwise they will not see each other. The
invitation codes are scoped to the app-id. Note that the app-id must be
unicode, not bytes, so on Python 2 use `u"appid"`.

Distinct app-ids reduce the size of the connection-id numbers. If fewer than
ten Wormholes are active for a given app-id, the connection-id will only need
to contain a single digit, even if some other app-id is currently using
thousands of concurrent sessions.

## Rendezvous Servers

The library depends upon a "rendezvous server", which is a service (on a
public IP address) that delivers small encrypted messages from one client to
the other. This must be the same for both clients, and is generally baked
into the application source code or default config.

This library includes the URL of a public rendezvous server run by the
author. Application developers can use this one, or they can run their own
(see the `wormhole-server` command and the `src/wormhole/server/` directory)
and configure their clients to use it instead. This URL is passed as a
unicode string. Note that because the server actually speaks WebSockets, the
URL starts with `ws:` instead of `http:`.

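As a small illustration of the URL requirements above, the sketch below checks that a relay URL is a text string with a WebSocket scheme before it is handed to `wormhole.create()`. The helper name `check_relay_url` and the example URL are hypothetical and not part of the library; accepting `wss:` (WebSockets over TLS) as well as `ws:` is an assumption. It targets Python 3, where `str` is the unicode type.

```python
from urllib.parse import urlparse


def check_relay_url(relay_url):
    """Return relay_url unchanged if it looks like a WebSocket URL.

    Hypothetical helper for illustration; not part of the wormhole API.
    """
    if not isinstance(relay_url, str):
        raise TypeError("relay URL must be a text (unicode) string")
    scheme = urlparse(relay_url).scheme
    if scheme not in ("ws", "wss"):
        raise ValueError("relay URL must start with ws: or wss:, got %r" % scheme)
    return relay_url


# Placeholder URL, used only for this example.
print(check_relay_url("ws://relay.example.com:4000/v1"))
```

Rejecting a bad URL up front gives a clearer error than a failed connection attempt later.
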
## Wormhole Parameters

All wormholes must be created with at least three parameters:

* `appid`: a (unicode) string
* `relay_url`: a (unicode) string
* `reactor`: the Twisted reactor object

In addition to these three, the `wormhole.create()` function takes several
optional arguments:

* `delegate`: provide a Delegate object to enable "delegated mode", or pass
  `None` (the default) to get "deferred mode"
* `journal`: provide a Journal object to enable journaled mode. See
  journal.md for details. Note that journals only work with delegated mode,
  not with deferred mode.
* `tor_manager`: to enable Tor support, create a `wormhole.TorManager`
  instance and pass it here. This will hide the client's IP address by
  proxying all connections (rendezvous and transit) through Tor. It also
  enables connecting to Onion-service transit hints, and (in the future) will
  enable the creation of Onion-services for transit purposes.
* `timing`: this accepts a DebugTiming instance, mostly for internal
  diagnostic purposes, to record the transmit/receive timestamps for all
  messages. The `wormhole --dump-timing=` feature uses this to build a
  JSON-format data bundle, and the `misc/dump-timing.py` tool can build a
  scrollable timing diagram from these bundles.
* `welcome_handler`: this is a function that will be called when the
  Rendezvous Server's "welcome" message is received. It is used to display
  important server messages in an application-specific way.
* `app_versions`: this can accept a dictionary (JSON-encodable) of data that
  will be made available to the peer via the `got_version` event. This data
  is delivered before any data messages, and can be used to indicate peer
  capabilities.

## Code Management

Each wormhole connection is defined by a shared secret "wormhole code". These
codes can be generated offline (by picking a unique number and some secret
words), but are more commonly generated by whoever creates the first
wormhole. In the "bin/wormhole" file-transfer tool, the default behavior is
for the sender to create the code, and for the receiver to type it in.

The code is a (unicode) string in the form `NNN-code-words`. The numeric
"NNN" prefix is the "channel id" or "nameplate", and is a short integer
allocated by talking to the rendezvous server. The rest is a
randomly-generated selection from the PGP wordlist, providing a default of 16
bits of entropy. The initiating program should display this code to the user,
who should transcribe it to the receiving user, who gives it to their local
Wormhole object by calling `set_code()`. The receiving program can also use
`type_code()` to use a readline-based input function: this offers tab
completion of allocated channel-ids and known codewords.

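To make the `NNN-code-words` layout concrete, here is a minimal sketch that splits a code into its nameplate and its secret portion. The function `split_code` is a hypothetical helper written for this document, not something the library exports, and it assumes Python 3 text strings. It splits on the first hyphen only, so the secret portion keeps any hyphens of its own.

```python
def split_code(code):
    """Split "NNN-code-words" into (nameplate, words).

    Hypothetical helper for illustration; not part of the wormhole API.
    """
    if not isinstance(code, str):
        raise TypeError("wormhole code must be a text (unicode) string")
    # Only the first hyphen separates the nameplate from the secret part.
    nameplate, sep, words = code.partition("-")
    if not sep or not nameplate.isdigit() or not words:
        raise ValueError("expected a code of the form NNN-code-words")
    return nameplate, words


print(split_code("4-purple-sausages"))
```
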
The Wormhole object has three APIs for generating or accepting a code:

* `w.generate_code(length=2)`: this contacts the Rendezvous Server, allocates
  a short numeric nameplate, chooses a configurable number of random words,
  then assembles them into the code
* `w.set_code(code)`: this accepts the code as an argument
* `helper = w.type_code()`: this facilitates interactive entry of the code,
  with tab-completion. The helper object has methods to return a list of
  viable completions for whatever portion of the code has been entered so
  far. A convenience wrapper is provided to attach this to the `rlcompleter`
  function of libreadline.

No matter which mode is used, the `w.when_code()` Deferred (or
`delegate.wormhole_got_code(code)` callback) will fire when the code is
known. `when_code` is clearly necessary for `generate_code`, since there's no
other way to learn what code was created, but it may be useful in other modes
for consistency.

The code-entry Helper object has the following API:

* `d = h.get_nameplates()`: returns a Deferred that fires with a list of
  (string) nameplates. These form the first portion of the wormhole code
  (e.g. "4" in "4-purple-sausages"). The list is requested from the server
  when `w.type_code()` is first called, and if the response arrives before
  `h.get_nameplates()` is called, it will be used without delay. All
  subsequent calls to `h.get_nameplates()` will provoke a fresh request to
  the server, so hitting Tab too early won't condemn the client to using a
  stale list.
* `h.set_nameplate(nameplate)`: commit to using a specific nameplate. Once
  this is called, `h.get_nameplates()` will raise an immediate exception.
* `completions = h.get_completions_for(prefix)`: given a prefix like "su",
  this returns (synchronously) a list of strings which are appropriate to
  append to the prefix (e.g. `["pportive", "rrender", "spicious"]`, for
  expansion into "supportive", "surrender", and "suspicious"). The prefix
  should not include the nameplate, but *should* include whatever words and
  hyphens have been typed so far (the default wordlist uses alternate lists,
  where even numbered words have three syllables, and odd numbered words have
  two, so the completions depend upon how many words are present, not just
  the partial last word). E.g. `get_completions_for("pr")` will return
  `["ocessor", "ovincial", "oximate"]`, while
  `get_completions_for("opulent-pr")` will return
  `["eclude", "efer", "eshrunk", "inter", "owler"]`.
* `h.set_words(suffix)`: this accepts a string (e.g. "purple-sausages"), and
  commits to the code. `h.set_nameplate()` must be called before this, and no
  other methods may be called afterwards. Calling this causes the
  `w.when_code()` Deferred or corresponding delegate callback to fire, and
  triggers the wormhole connection process.

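The position-dependent completion behavior described for `get_completions_for()` can be sketched in a few lines. This is only an illustration of the idea, not the library's implementation: the function `completions_for` is hypothetical, and the two tiny wordlists are stand-ins containing just the words mentioned in the examples above.

```python
# Tiny stand-in wordlists (hypothetical); the real default wordlist is larger.
EVEN_WORDS = ["processor", "provincial", "proximate", "opulent"]       # words 0, 2, ...
ODD_WORDS = ["preclude", "prefer", "preshrunk", "printer", "prowler"]  # words 1, 3, ...


def completions_for(prefix):
    """Return the suffixes that complete the last (partial) word of prefix."""
    words = prefix.split("-")
    partial = words[-1]
    # The position of the partial word selects which wordlist applies.
    wordlist = EVEN_WORDS if (len(words) - 1) % 2 == 0 else ODD_WORDS
    return sorted(w[len(partial):] for w in wordlist if w.startswith(partial))


print(completions_for("pr"))
print(completions_for("opulent-pr"))
```
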
The `rlcompleter` wrapper is a function that knows how to use the code-entry
helper to do tab completion of wormhole codes:

```python
from wormhole import create, rlcompleter_helper
w = create(appid, relay_url, reactor)
rlcompleter_helper("Wormhole code:", w.type_code())
d = w.when_code()
```

This helper runs Python's `raw_input()` function (`input()` on Python 3)
inside a thread, since it normally blocks.

The two machines participating in the wormhole setup are not distinguished:
it doesn't matter which one goes first, and both use the same Wormhole
constructor function. However, if `w.generate_code()` is used, only one side
should use it.

## Offline Codes

In most situations, the "sending" or "initiating" side will call
`w.generate_code()` and display the resulting code. The sending human reads
it and speaks, types, performs charades, or otherwise transmits the code to
the receiving human. The receiving human then types it into the receiving
computer, where it either calls `w.set_code()` (if the code is passed in via
argv) or `w.type_code()` (for interactive entry).

Usually one machine generates the code, and a pair of humans transcribes it
to the second machine (so `w.generate_code()` on one side, and `w.set_code()`
or `w.type_code()` on the other). But it is also possible for the humans to
generate the code offline, perhaps at a face-to-face meeting, and then take
the code back to their computers. In this case, `w.set_code()` will be used
on both sides. It is unlikely that the humans will restrict themselves to a
pre-established wordlist when manually generating codes, so the completion
feature of `w.type_code()` is not helpful.

When the humans create an invitation code out-of-band, they are responsible
for choosing an unused channel-ID (simply picking a random 3-or-more digit
number is probably enough), and some random words. Dice, coin flips, shuffled
cards, or repeated sampling of a high-resolution stopwatch are all useful
techniques. The invitation code uses the same format either way: channel-ID,
a hyphen, and an arbitrary string. There is no need to encode the sampled
random values (e.g. by using the Diceware wordlist) unless that makes it
easier to transcribe: e.g. rolling 6 dice could result in a code like
"913-166532", and flipping 16 coins could result in "123-HTTHHHTTHTTHHTHH".

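The shape of such a manually generated code can be sketched as follows. This simulates the coin flips in software purely to show the format and the entropy arithmetic; the helper `coin_flip_code` is hypothetical and not part of the library, and real offline generation would use physical coins or dice instead.

```python
import math
import secrets


def coin_flip_code(channel_id, flips=16):
    """Build a code like "123-HTTH..." from simulated coin flips.

    Hypothetical helper for illustration; not part of the wormhole API.
    """
    secret = "".join(secrets.choice("HT") for _ in range(flips))
    return "%d-%s" % (channel_id, secret)


print(coin_flip_code(123))

# Entropy of the two methods mentioned in the text, in bits.
print("16 coins: %.1f bits" % (16 * math.log2(2)))
print("6 dice:   %.1f bits" % (6 * math.log2(6)))
```

Sixteen coin flips give exactly 16 bits, matching the default for generated codes, while six dice give roughly 15.5 bits.
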
## Verifier

For extra protection against guessing attacks, Wormhole can provide a
"Verifier". This is a moderate-length series of bytes (a SHA256 hash) that is
derived from the supposedly-shared session key. If desired, both sides can
display this value, and the humans can manually compare them before allowing
the rest of the protocol to proceed. If they do not match, then the two
programs are not talking to each other (they may both be talking to a
man-in-the-middle attacker), and the protocol should be abandoned.

Once retrieved, you can turn this into hex or Base64 to print it, or render
it as ASCII-art, etc. Once the users are convinced that the `verify()` values
from both sides are the same, call `send()` to continue the protocol. If you
call `send()` before `verify()`, it will perform the complete protocol without
pausing.

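A minimal sketch of rendering a verifier for human comparison, using only the standard library. The `verifier` value here is a stand-in computed locally so the example runs on its own; in a real program the bytes would come from the wormhole. The two function names are hypothetical.

```python
import base64
import binascii
import hashlib

# Stand-in for the verifier bytes; a real verifier comes from the wormhole.
verifier = hashlib.sha256(b"example session key").digest()


def verifier_as_hex(verifier):
    """Render verifier bytes as a lowercase hex string."""
    return binascii.hexlify(verifier).decode("ascii")


def verifier_as_base64(verifier):
    """Render verifier bytes as a Base64 string."""
    return base64.b64encode(verifier).decode("ascii")


print(verifier_as_hex(verifier))
print(verifier_as_base64(verifier))
```

Hex is longer (64 characters for a SHA256 hash) but uses a smaller alphabet, which may be easier to read aloud than Base64.
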
## Events

As the wormhole connection is established, several events may be dispatched
to the application. In Delegated mode, these are dispatched by calling
functions on the delegate object. In Deferred mode, the application retrieves
Deferred objects from the wormhole, and event dispatch is performed by firing
those Deferreds.

* got_code (`yield w.when_code()` / `dg.wormhole_got_code(code)`): fired when the
  wormhole code is established, either after `w.generate_code()` finishes the
  generation process, or when the Input Helper returned by `w.type_code()`
  has been told `h.set_words()`, or immediately after `w.set_code(code)` is
  called. This is most useful after calling `w.generate_code()`, to show the
  generated code to the user so they can transcribe it to their peer.
* got_verifier (`yield w.when_verifier()` / `dg.wormhole_got_verifier(verf)`):
  fired when the key-exchange process has completed, and this side has
  learned the shared key. The "verifier" is a byte string with a hash of the
  shared session key; clients can compare them (probably as hex) to ensure
  that they're really talking to each other, and not to a man-in-the-middle.
  When `got_verifier` happens, this side has not yet seen evidence that the
  peer has used the correct wormhole code.
* got_version (`yield w.when_version()` / `dg.wormhole_got_version(version)`):
  fired when the VERSION message arrives from the peer. This serves two
  purposes. The first is that it provides confirmation that the peer (or a
  man-in-the-middle) has used the correct wormhole code. The second is
  delivery of the "app_versions" data (passed into `wormhole.create`).
* received (`yield w.when_received()` / `dg.wormhole_received(data)`): fired
  each time a data message arrives from the peer, with the bytestring that
  the peer passed into `w.send(data)`.
* closed (`yield w.close()` / `dg.wormhole_closed(result)`): fired when
  `w.close()` has finished shutting down the wormhole, which means all
  nameplates and mailboxes have been deallocated, and the WebSocket
  connection has been closed. This also fires if an internal error occurs
  (specifically WrongPasswordError, which indicates that an invalid encrypted
  message was received), which also shuts everything down. The `result` value
  is an exception (or Failure) object if the wormhole closed badly, or a
  string like "happy" if it had no problems before shutdown.

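For Delegated mode, the events listed above map onto one delegate class with five methods. The sketch below records each event and then drives the delegate by hand, so it runs without a wormhole or a rendezvous server. The class name `RecordingDelegate` and the sample values are made up for this example; the method names are the ones listed above.

```python
class RecordingDelegate:
    """Delegate that records each event it is told about."""

    def __init__(self):
        self.events = []

    def wormhole_got_code(self, code):
        self.events.append(("got_code", code))

    def wormhole_got_verifier(self, verf):
        self.events.append(("got_verifier", verf))

    def wormhole_got_version(self, version):
        self.events.append(("got_version", version))

    def wormhole_received(self, data):
        self.events.append(("received", data))

    def wormhole_closed(self, result):
        self.events.append(("closed", result))


# Simulate a plausible order of events by hand, without a real wormhole.
dg = RecordingDelegate()
dg.wormhole_got_code("4-purple-sausages")
dg.wormhole_got_verifier(b"\x01\x02")
dg.wormhole_got_version({})
dg.wormhole_received(b"hello")
dg.wormhole_closed("happy")
print([name for name, _ in dg.events])
```

In a real application an instance like this would be passed as `delegate=` to `wormhole.create()`, and the library would make these calls.
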
## Sending Data

The main purpose of a Wormhole is to send data. At any point after
construction, callers can invoke `w.send(data)`. This will queue the message
if necessary, but (if all goes well) will eventually result in the peer
getting a `received` event and the data being delivered to the application.

Since Wormhole provides an ordered record pipe, each call to `w.send` will
result in exactly one `received` event on the far side. Records are not
split, merged, dropped, or reordered.

Each side can do an arbitrary number of `send()` calls. The Wormhole is not
meant as a long-term communication channel, but some protocols work better if
they can exchange an initial pair of messages (perhaps offering some set of
negotiable capabilities), and then follow up with a second pair (to reveal
the results of the negotiation). The Rendezvous Server does not currently
enforce any particular limits on number of messages, size of messages, or
rate of transmission, but in general clients are expected to send fewer than
a dozen messages, of no more than perhaps 20 kB in size (remember that all
these messages are temporarily stored in a SQLite database on the server). A
future version of the protocol may make these limits more explicit, and will
allow clients to ask for greater capacity when they connect (probably by
passing additional "mailbox attribute" parameters with the
`allocate`/`claim`/`open` messages).

For bulk data transfer, see "transit.md", or the "Dilation" section below.

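Because the server does not enforce the informal limits described above, an application may want to enforce them on itself. The wrapper below is one possible sketch: `PoliteSender`, `FakeWormhole`, and the two constants are hypothetical names, and the numeric limits are just the rough guidance from the text ("fewer than a dozen", "perhaps 20 kB"), not values defined by the library.

```python
MAX_MESSAGES = 11          # "fewer than a dozen" (informal guidance)
MAX_MESSAGE_SIZE = 20000   # "perhaps 20 kB" (informal guidance, not enforced)


class PoliteSender:
    """Wrap a wormhole-like object and refuse oversized or excessive sends.

    Hypothetical helper; the wrapped object only needs a send(data) method.
    """

    def __init__(self, w):
        self._w = w
        self._count = 0

    def send(self, data):
        if not isinstance(data, bytes):
            raise TypeError("data must be bytes")
        if len(data) > MAX_MESSAGE_SIZE:
            raise ValueError("message too large; consider Transit for bulk data")
        if self._count >= MAX_MESSAGES:
            raise RuntimeError("too many messages for one wormhole")
        self._count += 1
        self._w.send(data)


class FakeWormhole:
    """Stand-in that records what would have been sent."""

    def __init__(self):
        self.sent = []

    def send(self, data):
        self.sent.append(data)


fake = FakeWormhole()
sender = PoliteSender(fake)
sender.send(b"outbound data")
print(fake.sent)
```

Rejected messages do not count against the message budget, since nothing was handed to the underlying object.
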
## Closing

When the application is done with the wormhole, it should call `w.close()`,
and wait for a `closed` event. This ensures that all server-side resources
are released (allowing the nameplate to be re-used by some other client), and
all network sockets are shut down.

In Deferred mode, this just means waiting for the Deferred returned by
`w.close()` to fire. In Delegated mode, this means calling `w.close()` (which
doesn't return anything) and waiting for the delegate's `wormhole_closed()`
method to be called.

## Bytes, Strings, Unicode, and Python 3

All cryptographically-sensitive parameters are passed as bytes ("str" in
Python 2, "bytes" in Python 3):

* verifier string
* data in/out
* transit records in/out

Other (human-facing) values are always unicode ("unicode" in Python 2, "str"
in Python 3):

* wormhole code
* relay URL
* transit URLs
* transit connection hints (e.g. "host:port")
* application identifier
* derived-key "purpose" string: `w.derive_key(PURPOSE, LENGTH)`
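
An application can check these types at its own boundary rather than converting silently. The sketch below assumes Python 3 (where `str` is unicode and `bytes` is the byte type); `require_text` and `require_bytes` are hypothetical helper names, not functions provided by the library.

```python
def require_text(value, name):
    """Return value if it is a text (unicode) string, else raise TypeError."""
    if not isinstance(value, str):
        raise TypeError("%s must be text (unicode), not %s"
                        % (name, type(value).__name__))
    return value


def require_bytes(value, name):
    """Return value if it is a byte string, else raise TypeError."""
    if not isinstance(value, bytes):
        raise TypeError("%s must be bytes, not %s"
                        % (name, type(value).__name__))
    return value


# Human-facing values are text; data sent through the wormhole is bytes.
code = require_text("4-purple-sausages", "wormhole code")
payload = require_bytes(b"outbound data", "data")
print(code, payload)
```

Raising on a mismatch, instead of encoding or decoding implicitly, keeps a mistaken type from being hidden until it causes a failure somewhere harder to diagnose.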