ntoll.org

(Everything I say is false...)

Drogulus - Questions and Clarifications

Wednesday 22nd May 2013 (8:00AM)

Last weekend I gave a very short (15 minute) talk on the drogulus: a programmable peer-to-peer data store that I've been working on in my own time. Pretty much all my answers in the short Q/A that followed were a variation of "I don't know". I consider this a success since it surfaced avenues for future investigation proposed by others (one of the purposes of giving the talk).

I prefer to say "I don't know", then think about the problem and write a considered response. I've had a couple of days to ponder the questions and comments from the Q/A (and later discussions in the bar), and in this post I'll attempt to answer, clarify or admit that, even upon reflection, I still don't know.

I'll assume you've read the blog post of the talk before reading the following.

What about flooding the network with Logos jobs? (or) How do you solve the problem of the tragedy of the commons?

Upon reflection I think this is a solvable problem. Put simply, there must be costs for misbehaviour and rewards for collaboration. I think a mechanism that uses both carrot and stick could be a solution (note that I didn't say it is the answer). My guess is that the specifics of a solution will become clearer if/when the drogulus gets used.

The drogulus is a peer-to-peer system: by virtue of the way the system works, peers have evidence of each other's behaviour. The ultimate punishment is to ostracise misbehaving nodes from the network, cutting them off from data and the latent computing power within the drogulus.

Therefore, in order to run Logos jobs, nodes must have shown evidence that they are "good citizens" of the network. Furthermore, there is the threat that evidence of "bad" behaviour will result in punishment.

To some extent this feature already exists within the drogulus: every node maintains a simple data structure called a routing table - the means of keeping track of peers on the network. To get into another node's routing table you must have been in contact with the node and provided some useful information in a timely fashion. The number of available slots in the routing table is limited by a constant called K. Only the most reliable nodes get included in other nodes' routing tables, and those that do not maintain good performance are removed and quickly replaced.

In this way, the distributed hash table's nodes attempt to use the most reliable peers to maintain the system's performance. Furthermore, if a node is found to propagate a value that fails the cryptographic checks it is immediately removed from routing tables no matter how reliable its prior performance.
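The behaviour described above can be sketched roughly as follows. This is a toy illustration and not the drogulus code: the class and method names, the value of K, and the "three failures in a row" rule are all assumptions made for the example, and a real Kademlia routing table is divided into several buckets rather than being one flat list.

```python
from collections import OrderedDict

K = 20  # assumed value; the post only says the limit is a constant called K


class RoutingTable:
    """Toy routing table holding at most K peers (hypothetical API)."""

    def __init__(self, k=K):
        self.k = k
        self.peers = OrderedDict()  # node_id -> consecutive failure count
        self.blocked = set()        # peers that propagated invalid items

    def record_success(self, node_id):
        """A peer answered usefully and in time: keep or admit it."""
        if node_id in self.blocked:
            return False
        if node_id in self.peers:
            self.peers.move_to_end(node_id)  # mark as most recently useful
            self.peers[node_id] = 0
            return True
        if len(self.peers) >= self.k:
            return False  # table is full; the newcomer waits for a free slot
        self.peers[node_id] = 0
        return True

    def record_failure(self, node_id, max_failures=3):
        """Drop a peer once it has failed too many times in a row."""
        if node_id not in self.peers:
            return
        self.peers[node_id] += 1
        if self.peers[node_id] >= max_failures:
            del self.peers[node_id]

    def record_invalid_item(self, node_id):
        """Failed cryptographic check: remove at once, whatever the history."""
        self.peers.pop(node_id, None)
        self.blocked.add(node_id)
```

The point of the sketch is the asymmetry: slow or flaky peers are dropped gradually, whereas a peer that propagates an invalid item is dropped immediately and cannot return.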

Something similar could be achieved for running Logos scripts. For example, peers may only run Logos jobs from remote nodes that have already run a Logos job for them (you scratch my back, I'll scratch yours) or from nodes that have existed within the routing table and fulfilled a certain number of successful interactions with the local node.
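As a purely speculative sketch of such a policy (nothing like this exists in the drogulus yet), the two conditions above could be expressed as a single check. The names PeerRecord, JobPolicy and MIN_INTERACTIONS, and the threshold of ten interactions, are hypothetical.

```python
from dataclasses import dataclass, field

MIN_INTERACTIONS = 10  # hypothetical threshold for "a certain number"


@dataclass
class PeerRecord:
    """What the local node remembers about one remote peer."""
    in_routing_table: bool = False
    successful_interactions: int = 0
    jobs_run_for_us: int = 0


@dataclass
class JobPolicy:
    """Decides whether to run a Logos job on behalf of a remote node."""
    records: dict = field(default_factory=dict)

    def may_run_job_for(self, node_id):
        record = self.records.get(node_id)
        if record is None:
            return False  # unknown peers have shown no evidence of good citizenship
        if record.jobs_run_for_us > 0:
            return True   # you scratched my back, I'll scratch yours
        return (record.in_routing_table
                and record.successful_interactions >= MIN_INTERACTIONS)
```

Under this sketch a brand new node has to earn its place through ordinary, cheap interactions before it can ask for computation.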

My aim is simply to think up a mechanism by which it costs nothing to be a good citizen yet is fatally expensive to be disruptive.

What happens if a third party attempts to block by IP address?

As I mentioned in the talk, areas of the key space are covered by many different nodes. The IP address of a node has nothing to do with the key space it covers. I presume a third party would be attempting to block access to a key/value item stored in the drogulus rather than a specific machine. To block an area of the key space a third party would have to take down all nodes containing the target key/value item.

Unfortunately, this is easier said than done because:

I imagine some of the properties of the drogulus are like a swarming flock of starlings: a dynamic system consisting of a multitude of independent parts that are constantly acting on and reacting to each other.

Swarm of starlings

What is the etymology of "drogulus" and "logos"?

A drogulus is an entity whose presence is unverifiable, because it has no physical effects. The atheist philosopher A.J.Ayer coined it as a way of ridiculing the belief system of his friend, the Jesuit philosopher, Frederick Copleston.

In 1949 Ayer and Copleston took part in a radio debate about the existence of God. The debate then went back and forth, until Ayer came up with the following as a way of illustrating the point that Copleston's metaphysics had no content because there was no way of testing the truth of metaphysical assertions. He said:

"I say, 'There's a "drogulus" over there,' and you say, 'What?' and I say, 'drogulus' and you say 'What's a drogulus?' Well, I say, 'I can't describe what a drogulus is, because it's not the sort of thing you can see or touch, it has no physical effects of any kind, but it's a disembodied being.' And you say, 'Well how am I to tell if it's there or it's not there?' and I say, 'There's no way of telling. Everything's just the same if it's there or it's not there. But the fact is it's there. There's a drogulus there standing just behind you, spiritually behind you.' Does that makes sense?"

Of course, the natural answer Ayer was waiting for was "No, of course it doesn't make sense." Therefore, the implication would be that metaphysics is like the "drogulus": a being which cannot be seen and has no perceptible effects. If Ayer can get to that point, he can claim that any kind of belief in the Christian God or in metaphysical principles in general is really contrary to our logical and scientific understanding of the world.

This appeals greatly to my sense of humour and I've always thought it'd be a fun name for a software project. Especially a project like this one. :-)

Portrait of A.J.Ayer
A.J.Ayer

Logos (λόγος) is a term used by the pre-Socratic philosopher, Heraclitus, to mean several different things: account, explanation, reason, organising principle, wisdom, nature or saying. It's the etymological root of the modern English word "logic". I do not use it in the biblical sense where it means the word of God (I refer readers to the explanation of "drogulus" above).

It seems to me an appropriate choice of name for a computer language.

It's a complex problem and you don't know what you're doing!

I won't contest that!

It's a fun personal project. If it is useful then people will use it. At the very least it's a helpful learning exercise for me (which is, in itself, a positive outcome).

However, a complex problem need not entail a complex solution. Rather, the solution only needs to work. Furthermore, while thinking about the drogulus I've attempted to work out the simplest possible solution to whatever abstract problem I've needed to solve.

As computing pioneer Tony Hoare explains,

"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

I'm aiming for Hoare's first method of constructing software.

What's being sent down the wire?

Dictionary-like objects (that are themselves valid statements in Logos) are encoded using msgpack and sent as a netstring to remote nodes over TCP/IP.
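To illustrate the framing, here is a minimal sketch of netstring encoding and decoding. A netstring is the payload length in decimal, a colon, the payload bytes and a trailing comma. To keep the example free of third-party dependencies, json stands in for msgpack when serialising the dictionary; the message fields are made up for the example and are not the drogulus wire format.

```python
import json


def to_netstring(payload: bytes) -> bytes:
    """Frame raw bytes as <length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def from_netstring(data: bytes) -> bytes:
    """Unframe exactly one netstring, raising ValueError if it is malformed."""
    length, sep, rest = data.partition(b":")
    if not sep or not length.isdigit():
        raise ValueError("missing or malformed length prefix")
    size = int(length)
    if len(rest) != size + 1 or rest[size:] != b",":
        raise ValueError("payload length mismatch or missing trailing comma")
    return rest[:size]


# Hypothetical message; json is only a stand-in for msgpack here.
message = {"type": "ping", "uuid": "1234"}
payload = json.dumps(message).encode("utf-8")
frame = to_netstring(payload)

# The receiving node reverses the two steps.
received = json.loads(from_netstring(frame).decode("utf-8"))
```

Because the length comes first, the receiver knows exactly how many bytes to read from the TCP stream before attempting to decode the message.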

How does the cryptographic signing work?

The relevant code can be found in the crypto.py module. I use the popular PyCrypto library for all cryptographic functionality.

Put simply, each item encompassing a key/value pair includes the following fields:

The public_key field is used to validate the sig value. If this is OK then the compound key is checked using the now-validated public_key and name fields.

This ensures both the provenance of the data and that it hasn't been tampered with.

Any items that don't pass the cryptographic checks are ignored and nodes that propagate them are punished by being blocked.
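The order of the two checks can be sketched as follows. This is an illustration under assumptions and not the code in crypto.py: it assumes the compound key is a SHA-512 hash of the public_key and name fields, that the item stores it in a field called key, and it takes the signature check as a caller-supplied function rather than guessing at the PyCrypto calls involved.

```python
import hashlib


def compound_key(public_key: str, name: str) -> str:
    """Derive a key from the owner's public key and the item's name.

    Assumption: SHA-512 over the two fields, hex encoded.
    """
    digest = hashlib.sha512()
    digest.update(public_key.encode("utf-8"))
    digest.update(name.encode("utf-8"))
    return digest.hexdigest()


def check_item(item: dict, verify_signature) -> bool:
    """Return True only if the item passes both checks.

    verify_signature(item) -> bool is supplied by the caller and stands in
    for validating the sig field against the public_key field.
    """
    if not verify_signature(item):
        return False  # provenance cannot be established, so stop here
    expected = compound_key(item["public_key"], item["name"])
    return item["key"] == expected  # "key" is a hypothetical field name
```

Deriving the key from the public key is what would stop one user from overwriting another user's item: without the matching private key they cannot produce a valid signature for that key.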

How is it licensed?

Under the GNU Affero General Public License version 3. I usually license my code under a more liberal license (e.g. the MIT license) but decided to use the AGPLv3 because this is the reference version of the drogulus and should always remain open (if anything comes of the drogulus, I hope many different implementations will exist).

Do you want or need help?

YES!

Image credits: Swarm of Starlings © 2008 Gail Johnson (under a creative commons license). Portrait of A.J.Ayer sourced from Wikipedia as fair use.


Politics, Programming, Data and the Drogulus

Saturday 18th May 2013 (4:00PM)

(This article is based upon a short talk I gave at Opentech 2013.)

Drogulus logo

I'm going to describe the drogulus: a programmable peer-to-peer data store that I've been hacking on in my spare time.

The problem I have is a growing unease with the current state of the web. This unease can be summed up in three ways:

  1. On the web, users are no longer in control of their data or online identity. They're locked in to websites that act as walled gardens of data each of which requires different credentials in order to log in. Once logged in there is often no way to extract data. Furthermore, how can we tell who's who? Is user X on Twitter the same user X on Facebook?
  2. Programmers have to build stuff on the web using complicated and quirky technology defined in a top down manner by committee. Just watch any web developer grimace if you mention OAuth, CORS, JavaScript date objects or (dare I say it) Internet Explorer.
  3. There are many inadvertent points of control, lock-in and authority built into the web - each of which is a potential mechanism for dis-empowerment and exploitation. Just look at the Great Firewall of China, the censorship of The Pirate Bay here in the UK or the kerfuffle over payment to Wikileaks.

The beautifully simple, open and decentralised hypertext system envisioned by Tim Berners-Lee has grown into a closed, centralised and complicated monster beholden to dodgy commercial, political and legal manipulation. More worryingly still, our data is analyzed by companies, sold via targeted advertising or handed over to governments without our consent.

Unfortunately, many aspects of today's web are contrary to a concept that is very important to me: autonomy.

When someone is autonomous they are self-directing, free to act of their own accord and lack imposition from others. Autonomy also suggests intelligence and awareness enough to be able to enjoy and make use of such freedom. Furthermore, such intelligence entails decision making so people become accountable for their actions. Autonomy is also the opposite of such undesirable states as tyranny, slavery, ignorance and apathy.

I asked myself, how would software designed to promote autonomy function? I started to hack and the drogulus was born.

The drogulus is a speculative exercise in peer-to-peer decentralisation, a political statement (it promotes a certain point of view about technology's role in society) and a place for me to explore fun ideas that have been knocking around in my head for a while.

I tell myself, "it'll all come to nothing" but I'm having too much fun to stop, so I want to tell you about the drogulus to give you a sense of why I find it so fascinating.

So, how is the drogulus designed to promote digital autonomy?

It's a global, federated, decentralized and open data store that can be programmed by anyone. Identity and provenance are ensured by cryptographically signing digital assets.

Being federated (the system consists of many independent but collaborating entities) and decentralized (no entity is more important than the others) ensures users are free from choke points of authority that may be used to control access to data and usage of the system.

Being an open system means all users are free to contribute, change, enhance and expand the system without prejudice.

Being programmable means users can do something with the data stored within the drogulus. It's a sort of distributed programming environment. Imagine it as a re-configurable SETI@home on steroids: by running a node in the drogulus network you are sharing a small amount of your potential computing power with everyone else on the network.

By using public key cryptography the drogulus ensures the provenance of the data and that it has not been tampered with. Importantly, there is no central authority to prove who's who. Identity is built on the far more humane mechanism of a web of trust.

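So how does it work? Well, there are three simple core components:

  1. A distributed hash table.
  2. A cryptographic layer based on public key cryptography.
  3. Logos, a programming language.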

Each component promotes autonomy in the following ways:

A distributed hash table works like a sort of peer-to-peer dictionary: a unique key is used to identify a value. In the case of a traditional dictionary, the key is a word and the value is its definition. Being a data store, the distributed hash table allows users to create, retrieve, update and delete their own keys and associated digital values.

The hash table is distributed because it is split into the equivalent of many volumes of a traditional dictionary. Each person who uses the dictionary has a copy of just one volume, with many copies of the same volume being distributed to many different users.

Users keep track of which of their friends on the network hold what volume. Users interact with the distributed hash table by contacting the friend with the correct volume for the desired item. If they don't know anyone with the right volume they play a sort of six-degrees-of-separation game with their friends until someone with the correct volume is found.

Distributed hash tables also share an interesting property with BitTorrent: the more popular an entry is the more widespread it becomes, thus improving performance since popular items are easier to find.

The drogulus implements a version of the Kademlia distributed hash table algorithm. The innovation the drogulus brings is that keys and values are signed in such a way that their provenance can be proven and content shown to be intact. Furthermore, users cannot interfere with each other's items stored within the distributed hash table unless they have access to the same private key.

Items are self-contained. Any that do not pass the cryptographic checks are ignored, and nodes on the network that attempt to propagate such values are punished by being blocked by their peers.
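To make the "which friend holds the right volume" idea a little more concrete: Kademlia measures how close a node is to a key by XOR-ing their identifiers, and a lookup repeatedly asks the closest known peers for peers that are closer still. The sketch below shows only the distance metric and the selection step. The use of SHA-512 for identifiers and the value k=20 are assumptions for the example, and the function names are hypothetical.

```python
import hashlib


def make_id(seed: str) -> int:
    """Map an arbitrary string to a large integer identifier (assumed SHA-512)."""
    return int.from_bytes(hashlib.sha512(seed.encode("utf-8")).digest(), "big")


def distance(a: int, b: int) -> int:
    """Kademlia's notion of distance: the bitwise XOR of two identifiers."""
    return a ^ b


def closest_peers(target: int, known_peers, k=20):
    """Pick the k known peers whose identifiers are nearest to the target key."""
    return sorted(known_peers, key=lambda peer: distance(peer, target))[:k]


# Small identifiers make the arithmetic easy to follow by hand.
peers = [0b1011, 0b0010, 0b1110]
nearest = closest_peers(0b1010, peers, k=2)
```

Here the distances from 0b1010 are 1, 8 and 4 respectively, so nearest holds 0b1011 and 0b1110 in that order.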

Programming the drogulus is done via Logos, a homoiconic language: this means code and data are the same thing in Logos. This has the interesting mind-bending side effect that Logos programs can rewrite other Logos programs in order to extend the Logos programming language itself. This is an important property: users have the autonomy to grow the Logos programming language to suit their own needs.

Since Logos programs are also data they are stored as values within the distributed hash table so users can re-use each other's code.
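Since Logos is still being designed, the following is not Logos. It is a minimal Python sketch of what "code and data are the same thing" means: a program is an ordinary nested list, so one function can evaluate it and another can rewrite it. The list-based syntax and the function names are assumptions made purely for illustration.

```python
import operator

# The only operations this toy language understands.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested-list expression such as ["+", 1, ["*", 2, 3]]."""
    if not isinstance(expr, list):
        return expr  # a plain number evaluates to itself
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


def rewrite(expr, old_op, new_op):
    """Return a copy of the program with one operator swapped for another."""
    if not isinstance(expr, list):
        return expr
    op, *args = expr
    head = new_op if op == old_op else op
    return [head] + [rewrite(arg, old_op, new_op) for arg in args]


program = ["+", 1, ["*", 2, 3]]      # a program that is also plain data
changed = rewrite(program, "*", "+")  # a program produced by another program
```

Evaluating program gives 7, while evaluating changed gives 6. Because a program is just a value, it could equally be stored under a key in the distributed hash table.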

They run in asynchronous "ensembles" on other nodes in the drogulus network. The result is delivered when the ensemble eventually arrives at a consensus. To protect peers, Logos programs are sandboxed and intentionally limited in terms of time (how long they can run for) and space (how much memory may be used).
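How an ensemble arrives at a consensus is not yet decided. One simple possibility, shown here only as an assumption, is a majority vote over the results reported by the participating nodes. The function name and the quorum parameter are hypothetical.

```python
from collections import Counter


def consensus(results, quorum=0.5):
    """Return the result reported by more than `quorum` of the nodes.

    Returns None when there are no results or no result has enough votes.
    Results must be hashable so that they can be counted.
    """
    if not results:
        return None
    value, votes = Counter(results).most_common(1)[0]
    return value if votes / len(results) > quorum else None
```

For example, consensus([42, 42, 7]) yields 42, whereas consensus([1, 2]) yields None because neither answer has more than half of the votes.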

At each point in hacking together the drogulus I've tried to build a solution whose outcomes reflect my ethical and political considerations: a focus on autonomy and openness and the removal of authority and choke points.

Unfortunately, the drogulus is unfinished! Currently, the distributed hash table is almost done, the cryptographic layer is finished and Logos is in the advanced planning stages with some experimental code written.

It's early days and I realise that there are potential contradictions and problems that I've not worked out, nor is there anything useful that can be achieved with the drogulus at this moment in time. Because I'm working at such an abstract level it's hard for me to comprehend what use others may find for a programmable peer-to-peer data store. That's why I wanted to present the drogulus in its current incomplete state: to gauge what sort of reaction (if any) it might get.

I'll finish by pointing out that in 1996 William Gibson described the web as merely the test card for 21st century technology. I'd like to think we can do better than the web. Thinking outside its confines has been a liberating experience and I'd encourage everyone to do the same. Obviously the drogulus is the rough and ready result of such pondering by me.

We'd only have ourselves to blame if we don't imagine and build something better than the test card that is the web. After all, if there's one thing that the web has taught us, it is that engineering software is a far more useful, tangible and easier agent of change than traditional means of political engagement.

The code is on GitHub and I've created a simple website that explains things further. If you have any questions please drop me an email.

Thanks!

EDIT: Most of the technical questions I was asked resulted in me saying, "I don't know". I consider this a good thing. It was interesting that someone picked up on the etymological fun I've been having with this project. ;-)

Image credits: © the author.


App for Doctors: Do I Treat This Immigrant?

Saturday 11th May 2013 (6:30PM)

I've just built (in all of 20 minutes) a helpful cross-platform website that automatically detects if a doctor in the UK should treat an immigrant asking for treatment. It's a fully responsive HTML5 application that works with *all* devices that connect to the web (although it's untested on IE6).

The BBC reports that, "Migrants' access to the NHS would be restricted" after the Queen's speech of 8th May 2013. Jeremy Hunt, health secretary, has identified that immigrants are clogging up UK hospitals.

Obviously, this application can't come soon enough:

http://doitreatthisimmigrant.com

The source code can be found here (pull requests welcome!):

For the "BIG DATA" people, there's even a RESTful API: JSON or XML.
