DeSci from first principles
Here are the notes from a talk I gave at the Oxford Blockchain Society / HomeDAO in May 2023.
It’s about the scientific system, how it works, the bottlenecks that stop us scaling it today, and why decentralized primitives might help.
(If this resonates, please drop me a message on LinkedIn/Twitter and I’d love to chat)
Welcome!
Eventually I’m going to talk about DeSci (aka decentralized science).
But before we get there, I’d like to talk about science more generally, and why things don’t seem to be working well at the moment.
Because I think the real problem in science is not what most of us are talking about.
So let's start by discussing what science is.
We live in a society where economic activity is possible.
And we’re just trying to grow the economy, i.e. do productive stuff that makes our lives, and our children’s lives, better.
And to do that, we have different asset classes, with different risk/return profiles.
And science sits at the extreme end of this spectrum. It is the economic activity with the highest risk/return profile.
In fact it’s so risky that there isn’t really an economic model for funding it, other than via the state and private grants.
So what does this high risk activity involve?
Scientists use new technologies to design experiments.
Many (most?) of them fail.
Sometimes they succeed, and we learn how the world works.
This informs the development of even better technologies, which extend our faculties and enable further scientific discoveries.
(And sometimes make us rethink how the old technologies worked, leading to scientific revolutions.)
This is a positive (viral) feedback loop.
Today we spend percentage points of GDP on this speculative activity, and it mostly works, which is pretty awesome.
But … there are some problems. So let’s talk about those.
I think it’s a fair summary to say that the experience of being a scientist today is very frustrating, particularly in academia.
(This list could be a lot longer but I just want to give you a flavour)
What is going on?
Most of the explanations I’ve heard tend to involve a scapegoat - publishers, funding, “bad science”, and so on.
I used to believe many of these, but I don’t find them compelling anymore.
I think the existence of so many competing explanations points to something deeper.
Something that has changed, that explains why science used to work and why now it’s struggling.
So what has changed?
Well … the number of scientists has gone up 25x in the last 100 years.
And if you take any system - a startup, a software application, a city - and scale it 25x, it won’t be 25x as productive. It might even get less productive, due to all the new bottlenecks.
So you will need to change the architecture of the system.
But it seems to me that the fundamental architecture of science (i.e. universities, labs, publishers, papers) has not really changed in that time.
And so, even as we throw ever more scientists at the system, the output (i.e. economic growth) has flatlined.
We have a scaling issue. So to fix science, we need to understand scaling.
Scaling is about making systems bigger.
Systems are made of things.
To scale a system, either you make the individual things bigger (vertical scaling), or you make more of the things (horizontal scaling).
Eventually you hit constraints due to fundamental limitations, e.g. there are too many things and they can’t all communicate with each other.
You overcome these by building “bigger” things out of the “smaller” things.
And repeat, until you have a multi-level hierarchy of abstractions built on top of each other.
So for example, in a database, the “thing” is a record of data and the goal is to manage lots of records.
You can write a basic database in an afternoon.
But it will struggle to scale as soon as you try to handle billions of records.
To solve that, you’ll need to introduce all sorts of abstractions - indexes, concurrency control, multiple nodes - and making all of that work together is the really hard part.
But the core idea is really simple.
(I’m using databases as an example because many of these concepts, like horizontal/vertical scaling, come from databases and the challenges of scaling them for internet-scale applications).
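To make this concrete, here’s a rough sketch of the kind of “afternoon database” I have in mind - a toy of my own, purely for illustration. The core idea fits in a few lines; everything that makes real databases scale is exactly what’s missing here.

```python
# A toy "database": a dict-backed key-value store you could write in an afternoon.
# It has none of the abstractions (indexes, concurrency control, sharding across
# nodes) that let real databases handle billions of records.

class ToyDatabase:
    def __init__(self):
        self.records = {}  # key -> record, all in memory, on a single machine

    def put(self, key, record):
        self.records[key] = record

    def get(self, key):
        return self.records.get(key)

    def find(self, field, value):
        # Without an index, every query is a full scan: O(n) over all records.
        # Fine for a handful of records, hopeless for billions - this is where
        # indexes, concurrency control and multiple nodes come in.
        return [r for r in self.records.values() if r.get(field) == value]


db = ToyDatabase()
db.put("exp-001", {"lab": "oxford", "result": "success"})
db.put("exp-002", {"lab": "cambridge", "result": "failure"})
print(db.find("lab", "oxford"))
```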
Biology takes this to the next level.
At its heart, biology is about self-replicating chemicals that encode information (and that’s all it was 4 billion years ago, or so we think).
But this is a natural feedback loop, and it wants to scale (i.e. make more chemicals).
It turns out that the most effective way to scale is not to make one giant soup of chemicals - for all sorts of reasons, mostly to do with physics again.
And so through the process of evolution, nature has built a stack of abstractions that scales this feedback loop up billions of times.
And there are many more examples.
Very often, I think you can take a system in the world and distill it down to a core mechanism or feedback loop.
And then all the complexity comes from scaling that system, building in layers of abstractions to handle all the constraints of physics.
But this is counterintuitive: we tend to assume that systems scale linearly, and so we end up “scaling blind”.
And I think that’s the situation with science.
At its heart, science is a simple feedback loop.
Scientists do experiments, share the ideas, and build new technologies.
This enables other scientists to build new experiments, and probe further into nature.
But, for example, it’s not efficient to have thousands of scientists all over the world communicating 1:1.
It scales better to write papers, send them to publishers, and distribute them to everyone via a journal. So publishers help solve the communication bottleneck between scientists.
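As a rough back-of-the-envelope sketch (my numbers, purely for illustration): 1:1 communication grows quadratically with the number of scientists, while a hub like a journal grows roughly linearly.

```python
# Toy illustration: number of communication links if every scientist talks to
# every other scientist 1:1, versus everyone publishing through a shared hub
# (e.g. a journal). The figures are illustrative, not real data.

def pairwise_links(n):
    # everyone talking to everyone: n * (n - 1) / 2 links
    return n * (n - 1) // 2

def hub_links(n):
    # everyone sending to / reading from one publisher: roughly n links
    return n

for n in [10, 1_000, 100_000]:
    print(f"{n:>7} scientists: {pairwise_links(n):>13,} pairwise links "
          f"vs {hub_links(n):>7,} via a hub")

# 100,000 scientists -> ~5 billion pairwise links, vs 100,000 via a hub.
# This is the kind of bottleneck that abstractions like journals exist to remove.
```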
And like that, we’ve scaled the system and built many abstractions.
And very roughly, and making all sorts of simplifications, here’s how the system looks today.
Despite the simplicity of this diagram, I think it’s enough for us to start to have a real conversation about specific scaling bottlenecks.
And start to think about potential solutions.
(Solutions plural, because scaling a complex system usually involves lots of interlinked changes and there is no simple fix.)
And you can also see the appeal of commercial labs/biotech startups, where you make a single research lab 10-100x bigger and remove all the “bad stuff”.
On the plus side, in my experience you can build interdisciplinary teams to tackle challenges in a way that is not possible in academia. So I think this is an important indicator of where we are going.
But … the downside is that commercial labs/startups are less open. This is a pure “vertical scaling” solution.
The strength of science is its horizontal scaling nature, with research labs working independently all over the world.
And as a general empirical observation, in the long term horizontal scaling beats vertical scaling, because vertical scaling eventually hits some kind of constraint.
Examples: multi-cellular organisms, open marketplaces, distributed databases.
And that naturally leads us onto DeSci, or decentralized science.
Decentralized technologies enable us to scale systems beyond trust boundaries.
In principle, this enables us to build more scalable alternatives to many of the entities in science today - like a single publishing system, or a global research institute.
These systems are implemented in code and run on a global, shared computer (aka a blockchain).
This is all relatively new, but here are examples of active projects today.
IP-NFTs provide a decentralized system for intellectual property ownership.
In principle, this removes the bottleneck of university tech transfer - making it easier for startups and investors to find IP and bring new technologies to market.
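To show the shape of the idea, here’s a hypothetical toy sketch of mine - not how IP-NFTs are actually implemented (they are smart contracts on a blockchain, with legal agreements attached), just the kind of shared state they manage: a registry mapping a token to a piece of IP and its current owner, visible to anyone without going through a tech transfer office. All names below are made up.

```python
# Hypothetical toy registry, for illustration only - not the real IP-NFT contracts.
from dataclasses import dataclass

@dataclass
class IPRecord:
    token_id: int
    metadata_uri: str  # e.g. a content hash pointing at the IP/legal documents
    owner: str         # e.g. a wallet address

class ToyIPRegistry:
    def __init__(self):
        self.records = {}  # token_id -> IPRecord, shared and readable by anyone

    def mint(self, token_id, metadata_uri, owner):
        assert token_id not in self.records, "token already exists"
        self.records[token_id] = IPRecord(token_id, metadata_uri, owner)

    def transfer(self, token_id, current_owner, new_owner):
        # only the current owner can hand the IP on to a new owner
        record = self.records[token_id]
        assert record.owner == current_owner, "only the owner can transfer"
        record.owner = new_owner


registry = ToyIPRegistry()
registry.mint(1, "ipfs://<hash-of-ip-agreement>", "0xLabWallet")
registry.transfer(1, "0xLabWallet", "0xBiotechStartup")
print(registry.records[1].owner)  # 0xBiotechStartup
```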
ScienceDAOs offer a decentralized alternative to the roles traditionally played by funding bodies, universities, journals and conferences.
I think of this as a networked (i.e. online-only) attempt to build interdisciplinary research labs, while also exploring ways to remove many of the other bottlenecks in science (like grants).
It’s very ambitious but I imagine we’ll learn a lot whatever happens.
LabDAO is building computational tools that run on shared infrastructure.
Nowadays there is an undersupply of software engineers in science. If we can make computational tools easy enough for non-coders, this potentially helps with that bottleneck.
What does this mean for the future?
As with any new paradigm, some things will work and some won’t - it will depend on all sorts of context-dependent factors, and it’s very hard to predict.
But in general, I think that the solutions that succeed will be those that remove the bottlenecks in a pragmatic way, enabling us to scale the core feedback loop of experiments, technology development and new ideas.
And my gut feeling is that DeSci will mostly augment, rather than replace the scientific system as we know it today.
This is essentially what happened in the open source software world, and I think there’s a strong analogy here.
And so, to conclude…
Science isn’t working today and it’s very frustrating.
I think it’s very natural to want to find someone to blame for that.
But I’ve personally found it helpful to think about this in terms of scaling issues - because scaling a system is a scientific challenge, and we know how to do that.
That makes me optimistic for the future, and excited to continue to explore ways to remove bottlenecks in science.
And that’s why I’m excited about DeSci.
Thanks for reading.
If you enjoyed this … no, you don’t have to post it on social media, but do share the link with your science friend on WhatsApp and spread the word.
Let’s scale science!