The Data Science DAO and On-Chain Trends

Interview with Alex Svanevik

Alex Svanevik is a leading on-chain data scientist providing cutting-edge insight into the blockchain world. In this conversation we discuss his path into crypto, his involvement with D5 - The Data Science DAO and Ethereum ETL, his opinions about the most interesting investment questions and much more.

Interview date – 20 January 2020

Alex Svanevik @asvanevik

Jay Bowles: Welcome Alex and great to meet you! Let’s start with you telling us about your background and path into crypto.

Alex Svanevik: My background is actually AI and machine learning. So that’s what I studied at university and the first thing I did career-wise was to start a small AI company along with two coursemates from uni. I did that for a while and then went into management consulting because I wanted to learn more about the business side of things. And then I went back to kinda hardcore data science machine learning for one of the large European media groups. So this was in 2014 and then 2017, similar to you in that sense, Ethereum popped up on my radar, like it did with many people. I’d also heard about Bitcoin many years before and I didn’t really find it that interesting because I didn’t buy into that whole ideology and so on.

I’m Norwegian, so there is a high level of trust in the government and institutions and so on. So the value point of Bitcoin, as such, is maybe less strong. Just given my cultural context. I never really got into Bitcoin but when I heard about Ethereum, I really liked the idea of it being a platform where you can essentially build anything, because it’s Turing complete and a world computer for value. I got into that, luckily, before the big ICO boom really took off.

JB: When was that?

AS: I remember I was talking to a bunch of guys at the office about Ethereum because some of them were early investors. This was during the summer of 2017.

JB: Yeah, yeah, similar to me.

AS: I think with many of these things, people tend to focus a lot more on ‘oh when did you get into crypto?’ I think it’s largely arbitrary, or coincidences that lead you to it. Or luck, I should say.

JB: Yeah well, I mean it’s interesting. I think it kind of ties into what you were just saying about how much Bitcoin resonated with you, because it didn’t resonate with me that much either. I think maybe the reality is that you get into crypto whenever there’s something that pops up that’s relevant to your background and your worldview. I mean, my background is in tech as well so it just spoke to me a lot more strongly than Bitcoin. Anyway, I interrupted you.

AS: Yeah, yeah, no, no absolutely. From an investment perspective, I always had this doubt with Bitcoin that it might be the Myspace of cryptocurrencies. It might be the first one, but you might be betting on the wrong horse. I like the Ethereum idea of it being a platform because it feels like you get more, you’re buying into a portion of an economy rather than just making a bet on just one currency.

I mentioned I bought my first ether and I started playing around with ICOs and the lot. I actually got really into the ICO space as well to be honest. It’s not really something that gives you a lot of cred in the crypto space but I actually really liked some of the, let’s say, the more idealistic reasons people got into ICOs. And the sense that it was kind of disruptive towards VCs because it was a more direct way of doing fundraising and also that as a normal person on the street, there’s no way you can get into Uber at an early stage. I like that aspect of it as well.

I was monitoring the ICO space and tokens quite closely. Started collecting data on it and I did some stuff around data on ICOs that I was fortunate to find an acquirer for, so they actually bought a database that I built up for ICOs.

I think Ethereum’s gonna work out. I think it’s gonna change lots of aspects of economic activity in various different ways. I figure that it’s data and so being in that niche of crypto and data should be a pretty interesting and attractive place to be. So initially I left my full-time job to go full time into crypto. This was early 2018, just pretty much when the bear market started. I worked full-time on crypto and data for a while and I actually joined a company that was in that space. It didn’t work out in the end, but I was fortunate enough to spend a lot of time looking at crypto data and so all of 2018 I was looking at different APIs, different data sources, analysing on-chain data.

This led me, in early 2019, to assemble a group of people and create this DAO called D5. And this DAO is essentially a third way where on the one hand you could go into employment and work for a company 9-5 and do well. Or you could do something that’s more free, where you do freelancing on your own. The idea with the D5 DAO is that we’re kind of in between, where you effectively individually control your time 100% but you have access to some of the benefits that you would have from being in a company. In particular, when you’re a freelancer, or an independent consultant or a contractor, the issue is with the supply and demand of work. Have you done any freelancing or consulting work yourself?

JB: Yeah actually. When I first left my job, I was doing consulting.

AS: So you probably know that, typically, you either have too much work or too little.

It’s like there’s always an imbalance of supply and demand. The thing with D5 is that if you’ve got too many clients coming in, you can pass them on to other members of the network because you know the other members of the network are really high-quality data scientists and data engineers. That’s the niche that D5 operates in. Conversely, if you don’t have that much work right now, you could announce that to the other people in the network and there is a good chance they might have some extra work for their client or you know another client that is coming in to get some help on data science and data engineering. So that was the starting point for D5 and very quickly, we proved that this model works. There’s also an incentive and revenue share model behind this. You get a referral fee if you pass on work to someone else and you also get an increased stake of the DAO itself, so you get more ownership of the DAO. The more work you refer in to the network, the more work that you can pick up and deliver on, because that’s kind of like value capture for the DAO. The DAO takes a small transaction fee. Think of it as like a tax or something on the billed amount but it’s very, very low. That helps us fund any expenses and so on.

JB: Interesting. How is D5 organised? Is there a large software component which is managing who gets access to funds or is it more manual like Moloch?

AS: It’s very, very organic and it’s pretty much all Slack and Discord and so on. What happens is a client comes to me and they say: ‘oh we need help getting Bitcoin data into a graph database’, or something like that. I’m busy right now and so I will say I can’t do this right now but I’ll check with one of the other guys in D5 and I just post that in our Slack group. If any of the other guys have time to do it, they might pick it up and I’ll just intro them to the client and then if they do the work then there is a small finder’s fee for me but the DAO also takes a small fee. 85-90% of the billed amount goes to the person actually delivering it.
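As a rough illustration of the split described above, the sketch below divides one invoice between the referrer, the DAO and the person delivering the work. The 10% referral rate and 2% DAO fee are assumptions picked only so that the deliverer's share lands inside the 85-90% range mentioned in the interview; the actual D5 parameters are not stated.

```python
from decimal import Decimal

def split_invoice(billed, referral_rate=Decimal("0.10"), dao_rate=Decimal("0.02")):
    """Split a billed amount between referrer, DAO treasury and deliverer.

    Both rates are hypothetical placeholders, not D5's real parameters.
    """
    billed = Decimal(billed)
    referral_fee = billed * referral_rate   # finder's fee for whoever relayed the work
    dao_fee = billed * dao_rate             # small "tax" that funds DAO expenses
    deliverer = billed - referral_fee - dao_fee
    return {"referrer": referral_fee, "dao": dao_fee, "deliverer": deliverer}

shares = split_invoice("10000")
print(shares)
```

With these assumed rates, the deliverer keeps 8,800 of a 10,000 invoice, i.e. 88%.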

JB: I see, it’s sort of like a formalised freelance consulting network.

AS: Yeah exactly. The thing is, these parameters that I was talking about: how big should the referral fee be, how big should the network fee or the tax, if you want to call it that, that’s all subject to voting and the voting is based on the token holder share. So it’s like standard token governance. Just to be very clear, right now, a lot of the stuff is not enforced by smart contracts. A lot of the stuff is still trust based because we are only seven people in the DAO and we have a high degree of trust with each other but every time we refine this model we will probably implement it as proper smart contracts.
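A minimal sketch of the kind of token-weighted vote described above, for example on the size of the referral fee. The member names, holdings and options are invented for illustration, and D5's actual process is, as Alex says, largely trust based rather than coded.

```python
def token_weighted_vote(holdings, ballots):
    """Tally ballots weighted by each voter's token holdings.

    holdings: dict of voter -> number of tokens held (hypothetical values)
    ballots:  dict of voter -> chosen option
    Returns the winning option and the per-option totals.
    Ties go to whichever option was tallied first.
    """
    totals = {}
    for voter, option in ballots.items():
        totals[option] = totals.get(option, 0) + holdings.get(voter, 0)
    winner = max(totals, key=totals.get)
    return winner, totals

holdings = {"member_a": 40, "member_b": 35, "member_c": 25}
ballots = {"member_a": "5% referral fee",
           "member_b": "10% referral fee",
           "member_c": "10% referral fee"}
winner, totals = token_weighted_vote(holdings, ballots)
print(winner, totals)
```

Here the two smaller holders together outweigh the largest one, so the 10% option wins with 60 tokens against 40.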

JB: I think that’s a good way to start. Sounds very practical.

AS: Yeah. It’s a bit more agile. We don’t wanna sit down one year and write a white paper and think about all the edge cases only to find out that it didn’t really work. We’d rather just start experimenting and this is a good environment for us to experiment.

Just briefly, a bit more on the structure of the DAO. What I explained now, we call it a relayer model because you relay work into the network but we also have two other components to how the DAO works. The other model is a bounty model because there’s one inevitable problem, which is that not everything that needs to get done will have a client paying for it.

So we need to create a website or we need to create a social media presence and like engage with potential clients. That kind of work is funded by bounties and these bounties are just tasks that we agree a price on and the price is paid out in our own native token so the D5 shares the cost. And so if you build a website, you get maybe the equivalent of a thousandth of a share of the company. So there’s a relayer model and a bounty model and the last thing is the venture model and this is work that gets created or incubated internally in the DAO. There’s no client paying for it but we believe that this could be a product that we could generate cashflow from or that we could sell to some other company in the future. And that kind of work could be funded by the DAO itself and the DAO would get shares in that so if we sell in the future and if there’s cashflow, the DAO would then benefit from that.

That’s been more recent so I think one month ago we kind of started formalising this venture model with you know some set of parameters and so on. And we’re working on our first venture product right now, which is what I have been sharing a lot of screenshots from on Twitter lately.

JB: What kind of work are you doing at D5?

AS: Under the umbrella of data science and engineering, we do lots of different things but about half of what we do in the consulting space is blockchain related and the other half is in other industries. You know traditional finance, telcos, gaming, there’s other industries out there that have needs for data science and data engineering. Right now, we’re also spending a bit of time on this venture product which is an Ethereum analytics product essentially.

JB: What are the most interesting things which you’ve happened upon since you started this new venture product?

AS: Yeah so actually a lot of stuff. I’m almost surprised every time I log in to this new product we have, because it’s kinda like a newsfeed but like an on-chain newsfeed. You can go in and think “hey what’s going on with Maker” for example. Instead of reading on Twitter what’s happening, you can just look at what large transactions took place, which balances have been net positive, which have been net negative and I kinda like that perspective, that you can guide your focus. Like hey there was this fifty thousand Maker transaction, what’s going on with that? And then you go and look on Twitter right? Instead of the other way around. I think there’s been a lot of interesting stuff going on with Maker so I think it’s worth paying attention to where the Maker tokens are going. In particular because of the governance structure of that project. So I’d say for anyone who is looking at on-chain data it’s interesting to look at the Maker tokens that are going out of the multisig address or contract for Maker. So without getting too specific on that, keeping track of where those tokens go is quite important to understand the Maker ecosystem, because those tokens have a lot of power. In theory, they could even make more Maker tokens and dilute the supply. Not that they would do that but you know, that’s one trend or interesting thing to keep an eye on.

I’ve seen instances of what appears to be decentralisation theatre in the sense that you might have one entity controlling a lot of Maker but then they split up the tokens across different wallets. Making it appear like votes are more decentralised than they actually are.

JB: I saw that tweet. How did you come across that? Actually, how did you even get the impression that those wallets were all from the same person?

AS: As part of this product that we’re building, it’s called Nansen by the way. We have this kind of… I call it the god mode of token profiler dashboards. Basically, that’s the landing page if you want to understand what’s happening with the token in this dashboard you see, you know what’s the trend on tokens moving in or out of exchanges, the balances, the change, the large transactions that take place and so on. In this particular case, it was literally a matter of me going into the god mode dashboard and seeing that there are some large transactions taking place and they are coming from the same wallet and I can see the same token moving out directly from those wallets. It was just very clear in the dashboard that there is something going on here. And when I dug deeper, I could see that there’s one entity which is moving funds out of three different wallets and those three different wallets are then moving their funds into the voting contract and so it seemed pretty clear that this is to create the impression that there are three voters when it is really just one.
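One simple way to surface the pattern described above is to group voting wallets by the address that funded them. The sketch below is only an assumed heuristic with made-up addresses, not Nansen's actual method, and a shared funder is a hint rather than proof (an exchange hot wallet, for instance, funds many unrelated wallets).

```python
from collections import defaultdict

def group_voters_by_funder(transfers, voting_wallets):
    """Return funders that sent tokens to more than one voting wallet.

    transfers: iterable of (sender, receiver, amount) token transfers
    voting_wallets: set of addresses that deposited into the voting contract
    """
    funded = defaultdict(set)
    for sender, receiver, _amount in transfers:
        if receiver in voting_wallets:
            funded[sender].add(receiver)
    # Keep only funders behind two or more voters
    return {f: sorted(w) for f, w in funded.items() if len(w) > 1}

transfers = [
    ("0xEntity", "0xA", 100),
    ("0xEntity", "0xB", 100),
    ("0xEntity", "0xC", 100),
    ("0xOther", "0xD", 50),
]
voters = {"0xA", "0xB", "0xC", "0xD"}
clusters = group_voters_by_funder(transfers, voters)
print(clusters)
```

In this toy data the three wallets funded by `0xEntity` are grouped together, while `0xD` is left alone.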

Alex in God Mode

Another trend which goes more across different projects is just keeping an eye on the share of tokens that are on exchanges. There’s a few other sites doing this. I think what we are doing is a bit different because we are actually including all the user wallets as well. Let’s say you have an account on Binance. When you move your funds to Binance, you send them to your unique user deposit wallet. No one else uses that wallet and so with our product, we have managed to find a way to actually tag up all the deposit wallets as well. That means we get an even more precise assessment of how many tokens are actually on an exchange and I see really big differences with different token projects on the proportion of tokens that are sitting on exchanges. The trends are quite different. Some token projects basically get an almost ever-increasing supply of tokens on exchanges which effectively increases that circulating supply on those exchanges. There are other projects where you see the exact opposite where like it’s almost like the investors are hoarding the tokens. So pulling tokens out of the exchanges, they are effectively decreasing the circulating supply.
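To see why tagging deposit wallets matters, the sketch below computes the share of supply on exchanges twice: once counting only a known exchange main wallet and once also counting labelled deposit wallets. All addresses, balances and labels are invented for illustration.

```python
def exchange_share(balances, exchange_labels, total_supply):
    """Fraction of total supply held by addresses labelled as exchange wallets."""
    on_exchange = sum(bal for addr, bal in balances.items() if addr in exchange_labels)
    return on_exchange / total_supply

# Hypothetical token balances
balances = {"0xhot": 300, "0xdep1": 50, "0xdep2": 50, "0xuser": 600}
total_supply = 1000

main_only = {"0xhot": "Binance"}
with_deposits = {"0xhot": "Binance", "0xdep1": "Binance", "0xdep2": "Binance"}

share_main = exchange_share(balances, main_only, total_supply)
share_full = exchange_share(balances, with_deposits, total_supply)
print(share_main, share_full)
```

In this toy example, counting the deposit wallets raises the measured exchange share from 30% to 40% of supply.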

JB: Any notable examples of the two of those – tokens moving onto exchanges and tokens moving off exchanges?

AS: There are two notable ones. Kyber network is a great project and I don’t mean to slam on them but it is an example of a project that has a lot of tokens on exchanges. I tweeted about this before, but basically if you just plotted out the amount of tokens that are on exchanges, meaning that they are available for purchase on exchanges, it’s been going up linearly pretty much since the start of the ICO.

JB: Oh wow, not a good sign for Kyber.

AS: They have a lot of great things going for them but that aspect is worth keeping an eye on if you are an investor.

JB: I’m curious if you noticed any difference about the token redesign announcement recently? Did that slow it?

AS: That’s a great question, I’ve seen it. The number of tokens on exchanges actually seems to have peaked and it’s reversing slightly or at least it’s flattened out. But yeah, I’m keeping an eye on it. It will be very interesting to see: if the staking model works, then hopefully investors will be pulling out their tokens from exchanges, which effectively decreases the circulating supply. That’s kind of the whole idea of staking in the first place. That’s why they do it. An example of the exact opposite is Synthetix which, because they have staking, incentivises people to pull tokens out of exchanges. So there you see the exact opposite trend. Tokens have basically been removed slowly from exchanges since they launched. Those are probably the two clearest examples, where you have massive differences in how many tokens are available for purchase on exchanges.

JB: It’s super interesting. I always come back to the fact that like I always think this kind of thing has been happening in startups since the inception of startups. Startups don’t have a real time price feed. The shares aren’t distributed to the public during the private phase so you can’t see this sort of stuff. What’s so interesting to me having spent so much time thinking about startups in the past is that now there are all these startups, and you can actually see very clearly, particularly people like you, with the tools to do so, can see very clearly what’s going on at any given time and what’s actually working and what’s not working. So it’s this really interesting revolution in how startups are assessed I think.

AS: For better and worse. I mean, when you have this amount of visibility, you also have massive influence of sentiment, and in many cases irrational, sentiments. It’s really interesting.

JB: Very disruptive to the projects themselves.

AS: Yeah it can be. I’ve noticed that in some cases, the more you learn about specifics of the token model and the more people are actually able to model out the consequences of a specific token model, the more deflated the expectations on price become. Because if you have a very obscure token model and no one understands, then it could be worth anything. But if you have a token model you could model with discounted cashflow or some kind of burn model or whatnot then you can get very specific and mathematical about it. What happens in some cases, sadly, is that you actually deflate the price because you can’t justify a $500 million market cap.
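A minimal discounted cashflow sketch shows how getting "specific and mathematical" can deflate expectations. The cashflows and the 40% discount rate below are purely hypothetical and do not describe any real token.

```python
def discounted_value(cashflows, discount_rate):
    """Present value of yearly cashflows, the first one arriving in one year."""
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cashflows, start=1))

# Hypothetical: a token expected to capture $10M a year for five years,
# discounted at 40% to reflect a very risky asset.
present_value = discounted_value([10_000_000] * 5, 0.40)
print(f"${present_value:,.0f}")
```

Under these assumptions the present value comes out at roughly $20 million, far short of a $500 million market cap.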

JB: Yeah that’s super interesting. I follow Chris Burniske quite closely and he’s done a lot of work into valuation models and maybe that’s what we are beginning to see. There’s sufficient understanding of token models now that at least with these sort of tokenised equity tokens, maybe this is the basis of a model. Because in the past, in normal equity markets, these models are so well established and so well refined but crypto is so new that we don’t have those models yet. So yeah, really curious and interested to see how this view of token models evolves.

AS: Also, I think you touched on an important point because in the traditional markets, pretty much all stocks follow the same model. They have dividends and that’s how it works. You might have buybacks and stuff, but you can model that and it’s the same model for everything. But with tokens, it’s like there’s an infinite amount of incentive systems or mechanism designs that you can build into the tokens. The design space is infinite potentially. So that makes it a lot more interesting and a lot harder as well.

JB: Yeah. That takes me back to what you said a moment ago about how unusual token models or token models which are relatively novel in comparison to most of the ones to date, can really deflate the price and can not work in the market very well. That makes me think that maybe that’s a factor in itself which will lead to more constraint and a pressure on the design space itself. People don’t want to do something new that’s not gonna lead to good performance in its price. Interesting stuff.

AS: Absolutely. I think we’ll see a convergence towards models that have been proven. That’s what we’ve seen with Kyber for example, I think. They’ve learned from the SNX model and they are like: ‘hey this seems to be working, their metrics are solid, price has been going up, people are pulling out tokens from exchanges and so maybe we should adopt something similar.’ So although the space is infinite, we’re gonna see clusters that will pop up in that space and I think most models will converge on those.

I did some work on trying to categorise exchange tokens a while back and I think there could be something to that because if you pick a specific niche like exchange tokens and by this I mean like the BNBs and the Huobi tokens you could see these exchange tokens basically copying each other’s models. You get fifty percent trading discounts, there’s a burn model and so on and you could pull out maybe ten different features of these exchange tokens and I think you could probably study like: ‘hey if an exchange token has this particular aspect, let’s say a burning mechanism, how has that impacted the price over time?’ for example. That kind of analysis has not been done that much actually, and because it’s such a broad space and still, we don’t have that much data. But yeah, really interesting to do that visualisation or just standard statistical analyses.

JB: That sounds really promising. What are the big questions you’re looking into from an investing perspective?

AS: So as we said earlier, I think the main interest for me on tokens is still this supply/demand aspect. So what drives tokens to be hoarded, for example. What actually works as a sink for tokens, where people pull them out of exchanges and then lock them up somewhere. And although the whole ICO space has not been very positive in most regards, I do think that it’s still one of the more interesting sections of the crypto space.

JB: You mean explicitly the token distribution?

AS: Yeah. Basically understanding what drives maybe not so much supply, both supply and demand or at least circulating supply or demand. So for tokens it’s still just like a sandbox of all these projects that are building very innovative technologies. They have made some kind of bet on token model but the actual consequences of what the market believes about these models is, there’s only one way to find that out and that’s like looking at the data and seeing over a, let’s say, five-year or ten-year horizon, how did these metrics get impacted by their choices. And when a project like Kyber changes their token model, how do their metrics get impacted? It’s not a very specific answer to your question but basically, I’d say what drives the circulating supply and the demand aspect of tokens. That’s one thing I am very interested in.

I think you could also broaden that question to also include Ether especially now that Ethereum 2.0 is coming out and there will be a staking mechanism there. So I think we will probably face some of the same questions for Ether as we have for these other tokens.

The other thing that I am now personally spending a lot of time on is basically classifying wallets, so finding good ways to segment and classify different Ethereum wallets. I’m not sure you’d call it a research question, but it is more like an infrastructural component that aids in other analyses. So we’ve got almost 40 million wallets labelled up in our database at the moment and trying to find better ways to accurately label these wallets is something that I spend a lot of time on.

JB: When you say label, what are some examples of labels? How are you categorising wallets?

AS: Yeah so some of them are like entity related, so as I mentioned earlier like Binance deposit wallets or like Binance user wallets. For example, this is an address where people are sending funds and those funds go into the Binance main wallet and you know there are certain graph aspects of the transaction that can let you classify the wallet with very high precision as being a Binance deposit wallet. That’s entity related and there’s other stuff that’s more, let’s say, behavioural or even just categorical or category related. So things like “is this a DEX trader”, “do they have a CDP or a Maker vault”. That can, in many cases help you understand what wallet it is. For example, if a wallet is a DEX trader, it is very unlikely that it is an exchange related wallet. Because exchange-related wallets are managed by their own wallet and privately managed systems. You wouldn’t see any DEX trades happen with exchange related wallets. Those are some categories. There is also some stuff that’s a bit more iffy, for example, ENS related wallets. If a wallet has acquired like a .eth address, that’s on-chain right? Not everyone understands the privacy aspect of that.
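The graph-based labelling described above might be sketched as follows: an address whose outgoing transfers all go to one known exchange main wallet is labelled a deposit wallet, unless it is already known to be a DEX trader. This is an assumed toy heuristic with made-up addresses, far simpler than whatever a production labelling system would use; a user who only ever sends straight to the main wallet would be mislabelled by it.

```python
from collections import defaultdict

def label_deposit_wallets(transfers, main_wallets, dex_traders=frozenset()):
    """Label addresses that forward funds only to a known exchange main wallet.

    transfers: iterable of (sender, receiver) pairs
    main_wallets: dict of main wallet address -> exchange name
    dex_traders: addresses seen trading on a DEX (treated as non-exchange)
    """
    outgoing = defaultdict(set)
    for sender, receiver in transfers:
        outgoing[sender].add(receiver)

    labels = {}
    for addr, destinations in outgoing.items():
        if addr in main_wallets or addr in dex_traders:
            continue
        if len(destinations) == 1:
            (dest,) = destinations
            if dest in main_wallets:
                labels[addr] = f"{main_wallets[dest]} deposit wallet"
    return labels

transfers = [
    ("0xuser1", "0xdep1"), ("0xdep1", "0xbinance"),
    ("0xuser2", "0xdep2"), ("0xdep2", "0xbinance"),
    ("0xuser1", "0xfriend"),
]
labels = label_deposit_wallets(transfers, {"0xbinance": "Binance"})
print(labels)
```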

JB: I hope they do. I had never really thought about the fact that they might not.

AS: Yeah. I also hope they do but you know, you’re basically telling the world that this is my wallet. It’s probably not the only wallet you have but if you do trading or if you do stuff with it, then people can figure out that it’s yours. Not with a hundred percent certainty, but if you have your .eth name on twitter and then there’s a .eth wallet out there, there is a good chance that that’s actually the one that belongs to this person. That information is kind of iffy because on one hand it feels a bit personal. On the other hand, it’s public information that’s on-chain and will be on-chain forever so… That’s another example of labels that we get because they are in the public domain. We don’t, just to be clear, we don’t collect any information about people’s personal wallets beyond that. Because this stuff is already in the public domain and then for the certain notable public figures, the Vitaliks of the world we would have that as well if it’s in the public domain.

JB: Cool. So what’s the next step? What are you going to do with all the labelled wallets?

AS: Just two weeks ago we started opening up to the first customers for this platform. We’re kind of bundling it up as an analytics product where you can better understand transactional flow, on-chain. So, in addition to just having all these wallets, we also have all sorts of very, very clean and structured on-chain data. Transactions, all token transfers, all sorts of smart contract events that are emitted and that kind of stuff. We can join that with the wallet labels and that gives you very powerful analytics and insight products. We’re basically gonna offer it as a product for anyone who wants to better understand what’s happening on-chain.

JB: Interesting. I noticed that you have been heavily involved with Ethereum ETL. Can you tell us a little more about that?

AS: Ethereum ETL was started by Evgeny Medvedev. He’s also in the D5 DAO. He basically wanted to start parsing out Ethereum on-chain data. Transactions, token transfers, et cetera. If you’ve ever looked at Ethereum data, if you try and get it from a node and so on, it’s not that convenient to deal with if you want to do analyses and so on.

The short version of Ethereum ETL and how it started was like we just said: ‘hey can we just get this data as CSVs so we can load it into a database?’ I was working on a similar project and then I came across Evgeny’s work on Ethereum ETL just at the beginning. I think he had been working on it for ten days or something and I came across his project and I ended up hiring him for the company I was working for at that time. Then we started working closely together and he kept working on it and then a bit later I got in contact with a guy called Allen Day at Google, who is very heavily involved in Google’s work on blockchain and trying to get developers to work with Google in-house and that kind of stuff. I introduced him to Evgeny, and the big break, let’s say, for Ethereum ETL was when Google basically announced that they would publish public data sets in Google BigQuery.

If you go to the BigQuery public data sets, which is a project in BigQuery, you find all the Ethereum transactions, token transfers and so on. Those data sets are powered by Ethereum ETL. They use Ethereum ETL to parse out the data from the full Ethereum node and that’s when a lot of people suddenly got the possibility to just run SQL on Ethereum data. This was back in 2018.

To take a step back from Ethereum ETL, the idea is just to make it very easy for anyone to work with Ethereum data. If you wanna do analyses, if you wanna build data applications, data products on top of Ethereum data, Ethereum ETL is kind of the open source not-for-profit project that you should use.
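The "CSVs into a database, then run SQL" idea can be sketched with the Python standard library alone. The column names below are assumed to resemble a token transfers export and the rows are made up; a real export should be checked against the Ethereum ETL schema.

```python
import csv
import io
import sqlite3

# Made-up sample rows; column names are an assumption for illustration.
SAMPLE_CSV = """token_address,from_address,to_address,value,block_number
0xtoken,0xaaa,0xbbb,100,9000000
0xtoken,0xbbb,0xccc,40,9000001
0xtoken,0xaaa,0xccc,25,9000002
"""

def load_transfers(csv_text):
    """Load token transfer rows from CSV text into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token_transfers ("
        "token_address TEXT, from_address TEXT, to_address TEXT, "
        "value INTEGER, block_number INTEGER)"
    )
    rows = [
        (r["token_address"], r["from_address"], r["to_address"],
         int(r["value"]), int(r["block_number"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO token_transfers VALUES (?, ?, ?, ?, ?)", rows)
    return conn

conn = load_transfers(SAMPLE_CSV)
# Total tokens received per address, largest first
received = conn.execute(
    "SELECT to_address, SUM(value) FROM token_transfers "
    "GROUP BY to_address ORDER BY SUM(value) DESC"
).fetchall()
print(received)
```

One caveat with this simplification: real token amounts are 256-bit integers and can exceed SQLite's 64-bit `INTEGER`, so a real pipeline would need to store them as text or in a decimal type.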

JB: How up to date, how real-time, is the data?

AS: The core datasets for Ethereum are close to real time. They have about a four-minute delay.

They have that delay to make sure that there’s no reorganisations of the chain. After four minutes it is very likely that there will be no reorgs on the chain. So yeah, like four minutes. For Ether transactions, token transfers, internal transactions, logs, it’s around a four-minute delay, so close to real time. And we also have a separate section where we parse out specific contract events so things like 0x transactions or Kyber trades, that kind of stuff. And those are up to one day or 24-hour delayed. Because we only run those jobs once a day.
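The lag described above amounts to exporting only blocks that are old enough to be unlikely to be reorganised. A minimal sketch, assuming a simple time-based cutoff of 240 seconds and invented block numbers and timestamps (the actual pipeline's mechanism is not described in the interview):

```python
def exportable_blocks(blocks, now, lag_seconds=240):
    """Return block numbers whose timestamp is at least lag_seconds old.

    blocks: iterable of (block_number, unix_timestamp) pairs
    now: current unix time
    """
    return [number for number, timestamp in blocks if now - timestamp >= lag_seconds]

# Hypothetical recent blocks
blocks = [(100, 1000), (101, 1013), (102, 1200), (103, 1230)]
safe = exportable_blocks(blocks, now=1300)
print(safe)
```

With `now=1300`, only blocks 100 and 101 are older than four minutes, so the two newest blocks are held back until they age past the cutoff.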

JB: What’s your assessment of the current state of on-chain data in general?

AS: Yeah I think it’s a really good thing that there are so many projects focusing on on-chain data. From a business perspective or an investment perspective, I do think the on-chain data eventually will become a commodity. There will be lots of different ways to interface with on-chain data in a very seamless and easy way. That’s where I think the space is headed. I think we’re gonna have super easy access to on-chain data. To some extent, we already have it, it’s just maybe people aren’t, in many cases, aware how easy it is to access on-chain data. So that’s one perspective that I think on-chain data is going to become commodity, you know… free, easy to use. That’s a really good thing because that means you get more transparency of the whole blockchain system, blockchain economy. And then the other perspective I think, for investors looking to make bets on the blockchain analytics space, I think the power lies in like combining on-chain and off-chain data. That’s what we are focusing on just to be very clear about that, with Nansen. The blockchain itself doesn’t really tell you which entities are related to the different wallets and there’s a lot of indexing work you need to do on top of the on-chain data for it to be truly valuable. I think, as a bet on the blockchain analytics space, that’s where the magic happens. When you combine the on-chain and off-chain data and naturally, what you can actually do with that data.

Maybe the third thing I would say, largely I think it’s uncharted territory, what you can do with blockchain data. There’s a lot of stuff you can do with it. Just to give you one example, Evgeny, who’s the founder or creator of Ethereum ETL and I, we created a token recommender. This was just a fun pet project we did to show what you could do with on-chain data. It basically looks at your token transfer history in your wallet and then it recommends what is the next token you should buy. Like Amazon; if you like this, then you might like this as well. There’s a lot of machine learning products that are really interesting and can be built on blockchain data. I think that’s largely unexplored and I think in the next couple years I think we’ll see a lot more products that will be built on top of blockchain data.
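The "if you like this, then you might like this as well" idea can be illustrated with a very small co-occurrence recommender over wallet holdings. This is an assumed toy approach with invented wallets, not a description of how the original token recommender worked.

```python
from collections import Counter

def recommend_tokens(wallet_tokens, wallet, top_n=3):
    """Recommend tokens held by wallets that share holdings with `wallet`.

    wallet_tokens: dict of wallet -> set of token symbols it has held
    Each candidate token is scored by the overlap between the target
    wallet and the wallets that hold the candidate.
    """
    owned = wallet_tokens[wallet]
    scores = Counter()
    for other, tokens in wallet_tokens.items():
        if other == wallet:
            continue
        overlap = len(owned & tokens)
        if overlap == 0:
            continue  # nothing in common, so no signal
        for token in tokens - owned:
            scores[token] += overlap
    return [token for token, _score in scores.most_common(top_n)]

wallet_tokens = {
    "alice": {"MKR", "SNX"},
    "bob": {"MKR", "SNX", "KNC"},
    "carol": {"MKR", "LINK"},
    "dave": {"BAT"},
}
suggestions = recommend_tokens(wallet_tokens, "alice")
print(suggestions)
```

For the made-up wallet `alice`, KNC ranks first because `bob` shares two tokens with her, followed by LINK from `carol`, who shares one.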

JB: That’s really cool. Had you released that? The token recommender, or is that something you kept to yourself?

AS: Yeah we actually put it on, we have a Medium post called ‘Machine Learning on Ethereum Data: Recommending Tokens’. The tool itself was live but isn’t working anymore.

What I think is cool about something like that token recommender is that in the machine learning and data science world, the largest companies in the world have a massive advantage because they sit on all the data. Google, Facebook, Netflix and so on have all the data so they’ll always have the biggest advantage. With blockchain, it’s cool because any hobbyist, any person sitting in their bedroom hacking away could get access to the whole blockchain and all the data that has ever been generated, on-chain, on Ethereum and they can build really cool machine learning or data products on top of that and so that’s kind of what we were trying to illustrate with this pet project. I think that’s a really cool thing that you don’t centralise the power of sitting on all that data to just these massive corporations.

JB: That’s fantastic. Thanks very much for taking the time to speak Alex, and really looking forward to seeing the progress in your various projects.

AS: Thanks Jay! Great speaking with you.


To hear more from Alex—

Jay Bowles · 17 Feb 20