Road to Tonk Substrate

Before VisiCalc, the first spreadsheet program, arrived in 1979, getting an answer from your data meant going through an expensive business process. An analyst would write up a request, hand it to the data processing department, wait days for a COBOL batch job to run overnight on the mainframe, and receive a printout, only to realize they had asked the wrong question. VisiCalc changed that paradigm entirely: working with data became a dynamic, interactive loop with the user at the center. That same analyst could sit down, change a few numbers and see everything update instantly. A painful cycle that took weeks now took seconds. Such a dramatic speedup wasn't achieved by making the business process more efficient, but by changing the paradigm of how work happens.

We believe that software is stuck in the mainframe era: it exists entirely as business processes, with tickets, negotiations with engineering, and CI/CD deployment queues. As LLMs make code generation abundant and cheap, a new era of personal software is opening, but we can't achieve its full promise by speeding up those same expensive, engineering-heavy processes. We need a new surface that is flexible, interoperable, and owned by the end user (it's their software, after all). At the same time, the abundance of digital matter created by LLMs threatens to flood out genuine human connection. We need technologies that put human connection first and allow us to have true agency over our digital lives.

Enter the substrate: a software environment whose internal structures are addressable and modifiable by its user. Substrates are malleable, open and horizontally connected, as opposed to the stack, which is rigid, enclosed and vertically integrated. Substrates make changing software easy. They support a continuous, open-ended potential while retaining ownership and control with the user. Substrates are also highly portable, so they can always travel with you and remain open for easy sharing with whoever you want (and closed to whoever you don't). In short, you modify software in the context of its use rather than through a separate process, and it is truly your data and your software.

Tonk is the name we're giving to our data substrate. We're stewarding Tonk through an independent structure to keep people free from the threat of corporate enclosure and to ensure all code is available under fair licensing. That way, your personal software remains on a surface that bends to human intention.

Dialog Cornerstone

Dialog-DB is an embeddable, local-first database and forms the foundation of the Tonk substrate. It's the first part of an open source ecosystem that we’re building out.

Dialog has a rich data model under the hood. It stores everything as facts, which are semantic triples of the form (entity, attribute, value). Facts are never deleted; instead, they're superseded by new facts or by a retraction. This append-only, content-addressed design is where much of the power and flexibility of Dialog lives. Building on this, Dialog indexes the data using prolly trees, whose structure depends only on the data indexed, not on the order in which it was inserted. This means two replicas can easily compare and merge with conflict-free, deterministic semantics. In other words, it syncs!
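To make the idea concrete, here is a minimal sketch of an append-only, content-addressed fact log. It is not Dialog's API: the `Fact` and `Replica` names, the attribute strings and the hashing scheme are all assumptions made for illustration, and a hash over the sorted addresses stands in for a real prolly tree. It also simplifies retraction so that a retracted triple stays retracted.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A hypothetical (entity, attribute, value) triple; not Dialog's real type."""
    entity: str
    attribute: str
    value: str
    retracted: bool = False  # True marks a retraction of an earlier assertion

    def address(self) -> str:
        # Content address: the same fact always hashes to the same key.
        payload = "\x1f".join(
            [self.entity, self.attribute, self.value, str(self.retracted)]
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Replica:
    def __init__(self):
        self.log = {}  # address -> Fact; entries are added, never removed

    def assert_fact(self, entity, attribute, value):
        fact = Fact(entity, attribute, value)
        self.log[fact.address()] = fact

    def retract(self, entity, attribute, value):
        # A retraction is itself a new entry; the original assertion stays.
        fact = Fact(entity, attribute, value, retracted=True)
        self.log[fact.address()] = fact

    def merge(self, other):
        # Set union over content addresses: commutative and idempotent.
        self.log.update(other.log)

    def current(self):
        # Simplification: a triple that was ever retracted is treated as gone.
        gone = {(f.entity, f.attribute, f.value)
                for f in self.log.values() if f.retracted}
        live = {(f.entity, f.attribute, f.value)
                for f in self.log.values() if not f.retracted}
        return live - gone

    def digest(self):
        # Stand-in for a prolly tree root: depends only on the set of
        # addresses, not on the order in which facts were inserted.
        joined = "".join(sorted(self.log))
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()


a, b = Replica(), Replica()
a.assert_fact("alice", "person/name", "Alice Smith")
a.assert_fact("alice", "person/home", "Lisbon")
b.assert_fact("alice", "person/home", "Lisbon")
b.retract("alice", "person/home", "Lisbon")
b.assert_fact("alice", "person/home", "Porto")

# Merge in both directions; the replicas converge on the same state.
a.merge(b)
b.merge(a)
```

After the two merges, both replicas hold the same four log entries and report the same digest, even though they received the facts in different orders. The earlier home location is still in the log; it is only hidden from the current view by its retraction.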

However, you don't need to know about any of that when you use Dialog. Instead, the primary way you interact with Dialog is through Concepts and Rules.

Concepts are a way to group related facts into a structure, which means you can have multiple views over the same data. You might say something like, “I need a Person, and a Person has a name, a home location and a profile image,” and that becomes the Person Concept. You could then create a ClubMember Concept that uses the name and profile image facts but leaves out the home location.
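As a rough illustration of Concepts as views, the sketch below treats a Concept as nothing more than a list of required attributes and projects the same set of facts through two of them. The `match` helper, the attribute names and the sample data are hypothetical and are not taken from Dialog itself.

```python
# Sample facts as (entity, attribute, value) triples; invented for illustration.
facts = {
    ("alice", "name", "Alice Smith"),
    ("alice", "home", "Lisbon"),
    ("alice", "image", "alice.png"),
    ("bob", "name", "Bob Jones"),
    ("bob", "image", "bob.png"),
}

# In this sketch a Concept is just the attributes an entity must have.
PERSON = ("name", "home", "image")
CLUB_MEMBER = ("name", "image")


def match(concept, facts):
    """Return one record per entity that has a fact for every attribute."""
    by_entity = {}
    for entity, attribute, value in facts:
        by_entity.setdefault(entity, {})[attribute] = value
    return {
        entity: {attr: attrs[attr] for attr in concept}
        for entity, attrs in by_entity.items()
        if all(attr in attrs for attr in concept)
    }


people = match(PERSON, facts)        # only entities with all three attributes
members = match(CLUB_MEMBER, facts)  # same facts, narrower view
```

Here `bob` has no home location, so he does not match Person but does match ClubMember. Nothing was copied or migrated: both results are views over the same underlying facts.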

Rules are a way to derive Concepts from existing claims and Concepts. You might say something like, “a FamilyMember is any Person whose name matches another Person's on the second part, and who shares the same home location with them.”
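The FamilyMember rule can be sketched as a plain function over Person records. This is only an illustration of the derivation logic under assumed names (`second_part`, `family_members`, the sample people); Dialog expresses Rules declaratively, and its actual syntax is not shown here.

```python
from itertools import permutations

# Person records keyed by entity; invented sample data.
people = {
    "alice": {"name": "Alice Smith", "home": "Lisbon"},
    "adam": {"name": "Adam Smith", "home": "Lisbon"},
    "anna": {"name": "Anna Smith", "home": "Porto"},
    "bob": {"name": "Bob Jones", "home": "Lisbon"},
}


def second_part(name):
    """Return the second whitespace-separated part of a name, if any."""
    parts = name.split()
    return parts[1] if len(parts) > 1 else None


def family_members(people):
    """Derive, for each Person, the other Persons who match the rule."""
    derived = {}
    for (a, pa), (b, pb) in permutations(people.items(), 2):
        same_second_part = (
            second_part(pa["name"]) is not None
            and second_part(pa["name"]) == second_part(pb["name"])
        )
        if same_second_part and pa["home"] == pb["home"]:
            derived.setdefault(a, set()).add(b)
    return derived


family = family_members(people)
```

With this data, only `alice` and `adam` are derived as FamilyMembers of each other: `anna` shares the second name part but lives elsewhere, and `bob` shares the home location but not the name.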

Once you have these two things, you can easily add new Rules and Concepts, or even extend existing Concepts. You simply declare what data you expect, and it works.

In addition to being simple, Dialog is meant to be highly portable. It doesn't require any servers to run; all you need is a storage layer such as S3 or R2. Both native Rust and WebAssembly targets are available, so you can run it anywhere.

If you'd like to know more about how Dialog-DB works, please see this fantastic presentation by its architects, Chris and Irakli.

Features

  1. User-centric: Dialog is local-first, which means that your data always stays with you.
  2. Flexible: the Dialog data model is designed for extensibility and adaptability. You can easily change the structure of your data without having to go through migrations.
  3. Open: sharing is simple and on your terms, always. That's how we've designed it.

For more details, see the Tonk Roadmap.

Getting Started

Tonk is still in internal testing, but if you would like to try out early experiments, please join our Discord. Introduce yourself and we'll get you set up.