Disentangling Decentralized Finance (DeFi) Compositions

STEFAN KITZLER, Complexity Science Hub Vienna and AIT - Austrian Institute of Technology, Austria
FRIEDHELM VICTOR, Technische Universität Berlin, Germany
PIETRO SAGGESE, AIT - Austrian Institute of Technology and Complexity Science Hub Vienna, Austria
BERNHARD HASLHOFER, Complexity Science Hub Vienna, Austria

We present a measurement study on compositions of Decentralized Finance (DeFi) protocols, which aim to disrupt traditional finance and offer services on top of distributed ledgers, such as Ethereum. Understanding DeFi compositions is of great importance, as they may impact the development of ecosystem interoperability, are increasingly integrated with web technologies, and may introduce risks through complexity. Starting from a dataset of 23 labeled DeFi protocols and 10,663,881 associated Ethereum accounts, we study the interactions of protocols and associated smart contracts. From a network perspective, we find that decentralized exchange (DEX) and lending protocol account nodes have high degree and centrality values, that interactions among protocol nodes primarily occur in a strongly connected component, and that known community detection methods cannot disentangle DeFi protocols. Therefore, we propose an algorithm to decompose a protocol call into a nested set of building blocks that may be part of other DeFi protocols. This allows us to untangle and study protocol compositions. With a ground truth dataset we have collected, we can demonstrate the algorithm’s capability by finding that swaps are the most frequently used building blocks. As building blocks can be nested, i.e., contained in each other, we provide visualizations of composition trees for deeper inspections. We also present a broad picture of DeFi compositions by extracting and flattening the entire nested building block structure across multiple DeFi protocols.
Finally, to demonstrate the practicality of our approach, we present a case study that is inspired by the recent collapse of the UST stablecoin in the Terra ecosystem. Under the hypothetical assumption that the stablecoin USD Tether would experience a similar fate, we study which building blocks and, thereby, DeFi protocols would be affected. Overall, our results and methods contribute to a better understanding of a new family of financial products.

CCS Concepts: • Applied computing → Digital cash; Electronic funds transfer.

Additional Key Words and Phrases: Decentralized Finance, DeFi, Blockchain, Ethereum, Networks

1 INTRODUCTION

Decentralized Finance (DeFi) stands for a new paradigm that aims to disrupt established financial markets. It offers financial services in the form of smart contracts, which are executable software programs deployed on top of distributed ledger technologies (DLT) such as Ethereum. Despite being a relatively recent development, we can already observe rapid growth in DeFi protocols enabling lending of virtual assets, exchanging them for other virtual assets without intermediaries, or betting on future price developments in the form of derivatives like options and futures. The term “financial lego” is sometimes used because DeFi services can be composed into new financial products and services.

Authors’ addresses: Stefan Kitzler, kitzler@csh.ac.at, Complexity Science Hub Vienna and AIT - Austrian Institute of Technology, Vienna, Austria; Friedhelm Victor, friedhelm.victor@tu-berlin.de, Technische Universität Berlin, Berlin, Germany; Pietro Saggese, pietro.saggese@ait.ac.at, AIT - Austrian Institute of Technology and Complexity Science Hub Vienna, Vienna, Austria; Bernhard Haslhofer, haslhofer@csh.ac.at, Complexity Science Hub Vienna, Vienna, Austria.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. © 2022 Association for Computing Machinery. 1559-1131/2022/10-ART $15.00 https://doi.org/10.1145/3532857 ACM Trans. Web

Fig. 1. A DeFi composition where USDT tokens are swapped against KYL tokens through the DeFi service 1inch in a single transaction. 1inch executes the swap sequentially through the DeFi services SushiSwap and UniSwap, using WETH as an intermediary token. In the transaction trace graph, we can see the user calling the 1inch smart contract, which in turn triggers several calls to DeFi protocol and token smart contracts.

As an example of a DeFi composition, consider Figure 1, which illustrates a user interacting with the decentralized exchange (DEX) aggregator Web service 1inch. The user holds an amount of USDT tokens and wants to swap them to KYL tokens. Using the Web application and her externally owned account (EOA), she creates a transaction against the 1inch contract, which in turn triggers a sequence of two swaps on two DeFi protocols within the same transaction, from USDT to WETH on SushiSwap and thereafter from WETH to KYL on UniSwap. In this paper, we study such single transaction DeFi interactions and the networks that arise when combining multiple DeFi transactions.

1.1 Motivation

In 2021, the total value of tokens held by smart contracts underlying the DeFi protocols reached 106 billion USD [12], demonstrating rapid growth.
As composability of DeFi protocols is frequently seen as one of the main advantages (cf. [36]), there are multiple reasons why it is interesting to study DeFi compositions:

Ecosystem interoperability. While composability can be seen as an opportunity, single transaction compositions as shown in Figure 1 currently only work within a single distributed ledger. Most of the emerging DLT scaling solutions, such as sidechains [27, 38], rollups [40], and off-chain networks [15, 37], lead to multiple, somewhat isolated DeFi ecosystems. Hence, composability is disrupted, as smart contracts on one platform cannot invoke contract functions on another platform within a single transaction. Understanding which types of compositions are frequently used may help in developing solutions to cross-chain [4, 20, 46, 52] composability. Until solutions are found, such knowledge can help in deciding which services should be co-located, and which services could be separate.

https://app.1inch.io (this and all the following links were accessed on June 15, 2022)

Integration with Web technologies. Cryptoassets have started integrating with various Web technologies. For example, the Brave browser includes an integrated cryptoasset wallet and native use of BAT tokens, and various applications from the commercial BitTorrent ecosystem rely on the BitTorrent Token (BTT). This raises the question of how DeFi compositions and web technologies depend on each other. Services like Furucombo already illustrate that almost arbitrary DeFi compositions are constructed through Web interfaces. In order to develop an understanding of this, however, it is important to identify compositions and their points of interaction in the first place.

Risks through complexity. After its deregulation in the early 2000s, the securitization market became more complex and opaque.
Financial institutions used new financial instruments to maximize their exposure in this market. They were based on technical computer models and traded by highly leveraged institutions, many of which did not understand the underlying models. These instruments were highly profitable, but the lack of any infrastructure and public information about them created a massive panic in the financial system that began in August 2007 [2].

DeFi protocols may offer opportunities, such as technological innovation or new governance models. However, their composability adds additional complexity and opaqueness to an already complex cryptoasset ecosystem, which currently has a market valuation of about 1T USD. If these protocols are adopted more broadly without being understood, they could have unforeseeable systemic effects on financial markets and our society as a whole, as seen in the 2008 financial crisis [21].

A recent example involving DeFi protocols is the collapse of the stablecoin protocol Terra and its associated cryptoassets LUNA and UST. While the protocol did work as designed, its stabilization mechanism was not robust to significant selling pressure in the event of market participants panicking. This ultimately led to deleveraging spiral effects [22], destroying over 30B USD of value within a single week and rendering institutions with large exposures to LUNA or UST insolvent. In addition, the stablecoin UST was used as part of compositions in many other DeFi protocols on the Terra blockchain and through bridges on different blockchains, thus affecting the entire ecosystem [35].

Previous work (cf. [10, 16]) has partially studied risks in the DeFi ecosystem, showing possible strategies that allow rational agents to maximize their revenues by subverting the intended design of DeFi protocols, for example in DEXs and lending protocols.
However, none of the existing studies have systematically investigated compositions of DeFi protocols, which form complex, interconnected financial instruments.

https://brave.com https://www.bittorrent.com https://furucombo.app https://coinmarketcap.com/charts/

1.2 Contributions

Our work aims to analyze DeFi protocols and to develop a novel algorithmic method that helps to understand protocol compositions. We can summarize our contributions as follows:

(1) We provide a manually curated ground truth of 1407 addresses from 23 DeFi protocols and derived 10,663,881 associated Ethereum smart contracts. These are labels that can be reused in future research. On this basis, we propose two network abstractions, representing interactions among DeFi protocols and smart contracts (Section 3).

(2) We study intertwined DeFi protocols from a macroscopic perspective by analyzing the topology of both networks. We find that DEX and lending protocols have high degree and centrality values, and protocol interactions primarily occur in a strongly connected component. We also find that known community detection algorithms can only indicate DeFi compositions but cannot effectively disentangle them (Section 4).

(3) We address the microscopic transaction level and propose an algorithm for extracting the building blocks of DeFi protocols. We apply the algorithm to all protocol transactions in our ground truth, identify the most frequent building blocks, and find that swaps are the most frequent ones. We show what the observed space of compositions looks like for the Aave protocol. Further, we also demonstrate, using 1inch and Instadapp as examples, how to disentangle and visualize the building blocks of a single protocol as a treemap (Section 5.1).

(4) We present an overall picture of DeFi compositions by extracting and flattening the entire nested building block structure across multiple DeFi protocols.
The results show that DeFi aggregation protocols (1inch, 0x, or Instadapp) are, as expected, heavily intertwined with many other DeFi protocols, which confirms that our algorithm works as intended (Section 5.2).

(5) Finally, we present a case study illustrating how a hypothetical run on the stablecoin USD Tether would affect the building blocks of individual DeFi protocols. We detect a comparatively high dependency of Curvefinance building blocks on the USDT cryptoasset (Section 5.3).

We believe that our results are an essential contribution towards understanding DeFi compositions. On a microscopic level, our proposed methods can be used to assess the composition of individual protocols. On a macroscopic level, they show how DeFi protocols and their implementations are connected with each other. For this paper, we limit our scope to the largest Ethereum Virtual Machine (EVM)-based blockchain, Ethereum, but in principle the approach can be applied to any other EVM-based platform. For reproducibility of results, we make our ground truth dataset, including the labels as well as our source code, openly available at https://github.com/StefanKit/Untangling_DeFi_Composition.

2 BACKGROUND AND DEFINITIONS

We now establish preliminary terms and definitions that are used throughout this work and introduce the related works.

2.1 Ethereum Account Types

Ethereum is currently the most important distributed ledger technology (blockchain) for DeFi services [53]. It differs from the Bitcoin blockchain conceptually as it implements the so-called “account model” with two different account types. An externally owned account (EOA) is a “regular” account controlled by a private key held by some user. A code account (CA), which is synonymous with the notion “smart contract”, is an account controlled by a computer program, which is invoked by issuing a transaction with the code account as the recipient.
A CA must always be initially called by an external transaction originating from an EOA, but a CA can itself trigger other CAs. In the latter case, the interaction, which is also known as a “message,” is denoted as an internal transaction. Several branches of internal transactions with varying depth can follow an external transaction, resulting in cascades, which altogether are called traces. CAs allow users to implement application-layer protocols, which are essentially programs that can follow some standardized interface. Tokens are popular CA-based applications and a way to define arbitrary assets that can be transferred between accounts. The program behind a token manages token ownership and can implement a standardized interface like ERC20, which defines functions standardizing token transfer semantics.

2.2 Decentralized Finance (DeFi) Protocol

A DeFi protocol is an application-layer program that provides financial service functions such as swapping or lending assets. More technically, we can define it as follows:

Definition 2.1. A DeFi protocol is a decentralized application that facilitates specific financial service functions defined and implemented by a set of protocol-specific code accounts.

The following properties distinguish DeFi services from traditional financial services: first, they are non-custodial, meaning that no intermediary such as a bank or a broker holds custody of users’ funds. Second, they are permissionless, meaning that anyone can use existing or implement new services. Third, they are transparent, which means that anyone with the necessary technical capabilities and skills can investigate and audit the state of protocols. Fourth, DeFi protocols are composable.
2.3 DeFi Protocol Compositions

The last property, composability, is the most crucial for this work and requires a more detailed description: CAs can call each other, and their individual functions can be arbitrarily composed into new financial products and services (“Financial Lego”) [49]. While this analogy is widely used in the literature, to the best of our knowledge, no work investigates which are the basic composable building blocks of more complex financial services and how they are related. Harvey et al. [19] refer broadly to composability as asset tokenization and networked liquidity, while Von Wachter et al. [44] conceive composability narrowly as a repeated wrapping operation of tokens resulting in new derivative products. However, as illustrated before in Figure 1, we note that DeFi compositions also involve CAs that are not tokens. Also, Engel and Herlihy [13] and Tolmach et al. [41] respectively discuss compositions only in the context of automated market makers (AMMs) and of formal verification of CAs related to decentralized exchanges and lending services, which is again a very narrow conception. Thus, there is no comprehensive, technically grounded definition for DeFi compositions to the best of our knowledge. For our work, we define it as follows:

Definition 2.2. A DeFi protocol composition occurs when a protocol-specific account leverages, within a single transaction, one or more accounts belonging to the same or another DeFi protocol to provide a novel financial service.

2.4 Related Work

Others have previously studied networks closely related to the ones we investigate: Guo et al. [18] are amongst the first to investigate the Ethereum transaction graph, finding that volumes moved and the numbers of transactions follow a power law distribution, that the component structure follows a bow-tie model, and that negative assortativity is plausibly explained by the presence of service providers such as exchanges. Chen et al. [7] conduct a systematic study of Ethereum between 2015 and 2018 and exploit graph analysis measures to describe three different network constructions (money transfer, smart contract creation, and smart contract invocation). Another systematic study has been conducted by Lee et al. [24], who analyzed the local and global properties of interaction networks extracted from the entire Ethereum blockchain statically, finding heavy-tailed degree distributions. In a follow-up, Zhao et al. [54] analyzed the temporal evolution of Ethereum interaction networks and found that they proliferate and follow the preferential attachment growth model.

Furthermore, several studies focus on the network of Ethereum’s tokenized assets: Somin et al. [39], for instance, studied the combined graph of all fungible token networks, while Victor and Lüders [43] explored the networks of the top 1,000 ERC20 tokens individually. Fröwis et al. [14] proposed a method for detecting token systems independent of an implementation standard. Also, Chen et al. [8] conducted a systematic investigation of the whole Ethereum ERC20 token ecosystem and analyzed their activeness, purpose, relationship, and role in token trading.

Other studies exploited network methods for the detection of specific nodes using graph-based approaches. Poursafaei et al. [32] developed a method based on graph node feature extraction and graph representation learning techniques to identify illicit nodes. Li et al. [25] and Ofori-Boateng et al. [30], instead, use Topological Data Analysis (TDA), respectively, to detect price anomalies and hidden co-movement in pairs of tokens, and to detect anomalous events in a multilayer network. However, none of these related works considers networks that represent DeFi protocols and their relationships.

Another growing body of research concentrates on specific functions offered by individual DeFi protocols or types of protocols.
We are aware of many DEX-related measurements focusing on protocol-specific aspects, such as the magnitude of cyclic arbitrage activity [47], the behavior of liquidity providers [48], or the role of oracles as providers of external information [26]. Other studies focus on lending and borrowing services: Perez et al. [31] analyze liquidations and related participants’ behavior in the DeFi protocol Compound, while Gudgeon et al. [17] compare market efficiency, utilization, and borrowing rates in different lending protocols. Also, Wang et al. [45] provide methods to identify flash loans in three different DeFi providers and measure their related activity.

Finally, we are aware that von Wachter et al. [44] investigate composability from an asset perspective and measure composability by identifying the number of derivatives produced from an initial root asset. However, we apply a more technical, service-oriented perspective and consider, to put it simply, a DeFi composition as being a computer program utilizing other programs’ functions.

Overall, we are not aware of previous studies providing a comprehensive picture of DeFi compositions across various protocols. We also do not know any work that analyzes in detail the building blocks of individual DeFi protocols. With this work, we want to close this gap.

3 DATASET AND NETWORK CONSTRUCTION

This section describes the data we collected and the network abstractions we constructed for subsequent analysis steps.

3.1 Dataset collection

To study DeFi compositions, we are interested in transactions between Ethereum code accounts associated with known DeFi protocols. Thus, we used on-chain transaction data from the Ethereum blockchain and built a ground truth of known CAs and their associations to DeFi protocols.

3.1.1 On-chain transaction data.
While Ethereum’s history goes as far back as July 2015, DeFi only emerged as a popular term around summer 2020, when these protocols first saw increased usage. This informed our choice of the analysis time frame and the ability to refer to external sources providing information on popular, established DeFi services. We used an OpenEthereum client and ethereum-etl to gather all Ethereum transactions from 01-Jan-2021 (block 11,565,019) to 05-Aug-2021 (block 12,964,999). We collected each external transaction and also parsed its cascade of internal transactions, which together give us the trace. For each transaction, we extracted the source and destination account addresses, the transaction hash, the transferred value, the transaction type (call, create, or self-destroy), as well as the trace ID, which indexes the transactions by their execution order. Additionally, we collected the method ID of the 4-byte input sequence, which allows us to identify the signature of called methods using the 4Byte lookup service.

To distinguish between CAs and EOAs, we gathered all code account creation transactions from the first CA created on Ethereum until the end of our observation period. We also use these creation traces to associate each CA with its creator CA. In total, we found 46,112,390 CAs and used the output byte sequence to identify 143,324 contracts conforming to the ERC20 standard.

3.1.2 Ground truth data.

To be able to analyze DeFi protocols, we need a ground truth dataset on which smart contracts are part of a given protocol. We focus on the most relevant protocols regarding valuation and gas burned between 06-Mar-2021 and 05-Aug-2021, using monthly samples of the top three total-value-locked (TVL) protocols from DeFi Pulse for each financial service category. Additionally, we consider protocols including CAs of the top ten gas burner list in the observation period. The result defines the set of DeFi protocols we want to investigate.
https://github.com/blockchain-etl/ethereum-etl https://www.4byte.directory/ https://defipulse.com/ https://ethgasstation.info/gas-burners

Table 1 reports summary statistics for the 23 protocols in our sample, divided by category. The last column reports, for each protocol, the average share of each protocol’s TVL with respect to the entire DeFi ecosystem, between March and August 2021. In total, our 23 DeFi protocols cover more than 81% of the entire DeFi TVL. According to DeFi Pulse, in August 2021 more than a hundred DeFi protocols existed, but only around 30 (of which 18 in our sample) had a TVL larger than 200M USD. Most of the protocols in our sample are still the most relevant ones for TVL as of June 2022. In the following, we briefly introduce the categories and protocols as reported by DeFi Pulse:

• Assets identify the category including cryptoasset management protocols, such as yield aggregators, that aim at maximizing the value of a portfolio or basket of underlying assets. Harvestfinance, Yearn, and Vesper share a similar mechanism, whereby they pool resources which are in turn invested in other DeFi platforms according to different optimization strategies. Users are typically rewarded through tokenized assets. Convex enables Curvefinance liquidity providers to earn additional rewards. Badger allows Bitcoin users to deposit tokenized Bitcoin such as wBTC and consequently generate a yield, by following programmatic optimization strategies. Similarly, RenVM bridges digital assets across DeFi ecosystems by minting ERC20 tokens on Ethereum with a 1:1 ratio. Fei’s protocol builds on a decentralized stablecoin backed by cryptoassets exploited through yield strategies established by the protocol’s governance.

• Derivatives protocols allow issuing synthetic financial instruments in the DeFi ecosystem, either tracking other cryptoassets or real-world off-chain assets.
Synthetix, for instance, supports several real-world assets, such as fiat currencies and metals, while dYdX allows investors to trade perpetual positions on the underlying cryptoassets. Hegic enables the issuing of ETH and wBTC call and put options. Futureswap users can open leveraged long and short positions on cryptoassets. Nexus, instead, provides financial insurance instruments that cover potential losses users might incur; similarly, Barnbridge offers tools to hedge risk through its financial instruments.

• DEXs, i.e., Decentralized Exchanges, allow users to exchange cryptoassets. UniSwap, SushiSwap, Curvefinance, and Balancer all exploit Automated Market Makers (AMM), as well as bonding curves and constant functions to algorithmically set the cryptoasset prices, while 0x is based on the order book mechanism. The 1inch protocol aggregates information on liquidity from several DEXs and routes transactions to those offering the best prices.

• Lending protocols provide investors with automated markets for loanable funds: lenders issue interest-bearing instruments and borrowers can take positions, typically conditional on the provision of collateral that covers potential losses. Aave and Compound follow the model described above. Maker users lock their cryptoassets as collateral and receive the DAI token in return. Instadapp follows a more complex scheme and acts mostly as an aggregator of multiple DeFi protocols.

After identifying the most relevant DeFi protocols, we manually collected the CAs associated with each protocol. Since this information is not available on the blockchain, we rely on off-chain and publicly available sources like protocol websites and available documentation. We resolved conflicts of duplicated CA to protocol assignments and identical names by querying CA addresses on Etherscan and uniquely assigned each CA address to its original protocol and obtained a unique label.
We denote these manually collected data points as seed data and make them available as part of our source code repository.

Footnotes: 10 out of the first 11 DeFi protocols for TVL in DeFi Pulse are in our dataset. DeFi Pulse reports the protocols divided into five categories. We don’t include the Payment category because services like Polygon provide off-chain functionality rather than composable financial services or products. https://etherscan.io/

Table 1. Ground truth dataset summary statistics. Seed addresses were manually collected for each DeFi protocol. The extended seed addresses are heuristically derived and also include further code accounts created from the seed addresses.

| Protocol type | Protocol | Seed addresses | Extended seed addresses | External calls | % TVL |
|---|---|---|---|---|---|
| Assets | Badger | 64 | 278 | 258,773 | 1.09% |
| Assets | Convex | 22 | 131 | 147,855 | 1.13% |
| Assets | Fei | 40 | 37 | 146,691 | 0.28% |
| Assets | Harvestfinance | 101 | 803 | 119,631 | 0.46% |
| Assets | RenVM | 15 | 15 | 234,161 | 0.86% |
| Assets | Vesper | 44 | 44 | 94,189 | 1.19% |
| Assets | Yearn | 3 | 3 | 243,036 | 3.54% |
| Derivatives | Barnbridge | 40 | 46 | 55,588 | 0.17% |
| Derivatives | dYdX | 38 | 38 | 107,264 | 0.14% |
| Derivatives | Futureswap | 9 | 10 | 6,484 | 0.04% |
| Derivatives | Hegic | 8 | 8 | 8,372 | 0.03% |
| Derivatives | Nexus | 24 | 26 | 20,067 | 0.57% |
| Derivatives | Synthetix | 271 | 272 | 611,942 | 2.55% |
| DEX | 0x | 28 | 50 | 2,094,335 | - |
| DEX | 1inch | 15 | 10,338,305 | 1,277,641 | 0.52% |
| DEX | Balancer | 9 | 3,473 | 281,530 | 2.29% |
| DEX | Curvefinance | 163 | 267 | 745,672 | 9.28% |
| DEX | SushiSwap | 12 | 1,705 | 2,026,674 | 5.37% |
| DEX | UniSwap | 15 | 54,038 | 28,394,798 | 8.30% |
| Lending | Aave | 157 | 166 | 851,578 | 13.31% |
| Lending | Compound | 67 | 65 | 741,069 | 11.48% |
| Lending | Instadapp | 72 | 32,770 | 97,080 | 7.39% |
| Lending | Maker | 190 | 231,261 | 2,992,692 | 11.77% |

Next, we extended our seed data by implementing a heuristic that uses the creation transactions and identifies the CAs deployed by each seed address. By default, all extended addresses inherit the label and protocol assignments from the corresponding seed address. If the procedure leads to a conflict of labels for an address, we preserve the one obtained through the heuristic. Combined with our seed data, these extended addresses form our extended seed data set.
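The label propagation behind the extended seed data set can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: creation traces are assumed to be available as (creator, created) address pairs, seed labels as a mapping from address to protocol name, and deployments are followed transitively. The names `extend_seed`, `seed_labels`, and `creations` are hypothetical. When the same address is reached from several seeds, the sketch keeps the first derived label, which is an arbitrary tie-break that the text does not specify.

```python
from collections import defaultdict, deque


def extend_seed(seed_labels, creations):
    """Propagate protocol labels from seed addresses to the CAs they deployed.

    seed_labels: dict mapping address -> protocol name (manually collected seed).
    creations:   iterable of (creator, created) address pairs from creation traces.
    Returns a dict address -> protocol for seed and derived addresses.
    """
    # Index creation edges by creator so each deployment lookup is cheap.
    created_by = defaultdict(list)
    for creator, created in creations:
        created_by[creator].append(created)

    derived = {}  # labels obtained through the heuristic
    for seed in sorted(seed_labels):  # sorted for a deterministic result
        protocol = seed_labels[seed]
        queue = deque([seed])
        seen = {seed}
        while queue:
            creator = queue.popleft()
            for child in created_by.get(creator, ()):
                if child in seen:
                    continue
                seen.add(child)
                # Assumption: the first derived label wins among derived labels.
                derived.setdefault(child, protocol)
                queue.append(child)

    # On a conflict with a seed label, the heuristic label is preserved.
    return {**seed_labels, **derived}


if __name__ == "__main__":
    seed = {"0xA": "UniSwap", "0xB": "1inch", "0xC": "UniSwap"}
    edges = [("0xA", "0xA1"), ("0xA1", "0xA2"), ("0xB", "0xC"), ("0xZ", "0xZ1")]
    for address, protocol in sorted(extend_seed(seed, edges).items()):
        print(address, protocol)
```

In the toy input, `0xC` is a seed address labeled UniSwap that was deployed by a 1inch seed address, so the heuristic label replaces the manual one. This kind of relabeling would be one way to arrive at an extended count that is lower than the seed count for a protocol.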
Table 1 summarizes the number of seed and extended addresses collected for each DeFi protocol. It shows that our automated expansion barely increases the number of addresses associated with DeFi protocols for assets and derivatives. However, it massively expands the dataset for DEXs and lending protocols utilizing automated factory contract deployments. In particular, more than 10 million additional CAs are associated with 1inch due to the factory contract that deploys gas tokens. The External calls column shows the number of external transactions directed to each of our DeFi protocols. The distribution is heterogeneous, and again the most relevant categories are DEX and lending. UniSwap is the most frequently called one, with a gap of around one order of magnitude to the second one, which is Maker.

3.1.3 Dataset reduction.

As we are only interested in known DeFi protocols, we finally limited and reduced the traces data set to the subset of protocol traces, where the initial external transaction originating from an EOA triggers a CA address in our extended seed dataset. This reduction allows us to investigate and interpret compositions within the context of known protocols.

Fig. 2. Schematic illustration of constructed networks. The lower-level DeFi Code Account (CA) network represents interactions between CAs. The higher-level DeFi Protocol Network models relations between DeFi protocols. Lower-level CA vertices are associated with higher-level protocol vertices. CAs are triggered by EOAs or other CAs.

3.2 Network construction

In our analysis, we want to understand and discover relations between DeFi protocols and associated CAs.
For that purpose, as shown in Figure 2, we constructed networks consisting of DeFi traces on two abstraction levels: the lower-level DeFi Code Account (CA) Network and the higher-level DeFi Protocol Network.

The DeFi CA Network includes all known ground truth CAs triggered by external transactions from arbitrary EOA addresses and all CAs subsequently called by cascades of internal transactions. We note that CAs in the network may or may not be associated with a DeFi protocol in our ground truth dataset. We construct the network by filtering all internal and external transactions between CAs from the protocol traces. Since repeated usage of DeFi services results in recurring transaction patterns, we aggregate and count transactions with the same source and destination address.

The DeFi Protocol Network represents interactions between protocols. We constructed it by merging all DeFi CA vertices associated with the same DeFi protocol into a single node. We note that we modeled both networks as a directed graph, in which vertices represent either a protocol or a single CA. The weighted edges represent the aggregated set of transactions between DeFi protocols or CAs.

4 TOPOLOGY MEASUREMENTS

We now analyze the constructed networks from a macroscopic perspective. Since our research focuses on understanding DeFi compositions, we do not aim at conducting an encompassing study of the entire Ethereum topology, as was done in previous studies (see Section 2.4). This supports our choice to focus on a smaller number of targeted metrics that provide relevant insights on composability aspects; other approaches that are beyond the scope of our work are discussed in Section 6.2. The analysis of the degree distribution and centrality measures can help identify the CAs implementing core functionalities, and the reciprocity and assortativity provide additional insights on the relationships across such CAs. To understand how CAs associated with the same protocols interact with each other, we investigate how the network is separated into different components and whether known community detection algorithms identify community structures that overlap with the protocol structures.

We start by reporting basic summary statistics for the DeFi CA network and the DeFi Protocol network in Table 2. The main difference is in the network dimension, the latter being two orders of magnitude smaller. The presence of self-loops indicates that some contracts include multiple functionalities and thus can also call themselves. Both networks are sparse, as shown by the average degree and density measure, suggesting that CAs tend to interact with only a few other CAs.

Table 2. Summary statistics of the analyzed networks.

| | DeFi CA network | DeFi Protocol network |
|---|---|---|
| Nodes | 2,536,371 | 43,624 |
| Edges | 3,472,757 | 84,789 |
| Self-loops | 6,668 | 146 |
| Average degree | 1.369 | 1.944 |
| Density | 5.398e-07 | 4.456e-05 |

Fig. 3. Degree distributions of the CA and Protocol networks, shown as complementary cumulative distribution functions (CCDF) plotted against the degree. The estimated parameters $\hat{\theta} = (\hat{k}_{\min}, \hat{\alpha})$ are respectively $\hat{\theta}_{CA} = (93, 1.69)$ and $\hat{\theta}_{P} = (25, 1.83)$. In both networks, high-degree nodes are associated with DEX or lending protocols. For the CA network, they are routing contracts or factory contracts that deploy other contracts. Nodes with high degree are likely to contain core functionalities and thus to play a relevant role in compositions.

Table 3. Likelihood ratio and p-value. None of the reported heavy-tailed distributions is favored over the power law.
Distribution   DeFi CA Network            DeFi Protocol Network
Exponential    R: 1.322, p-val: 0.186     R: 4.753, p-val: 0.000
Lognormal      R: -0.406, p-val: 0.685    R: 0.191, p-val: 0.848
Weibull        R: 1.122, p-val: 0.262     R: 2.742, p-val: 0.006

4.1 Degree distribution

Looking at the total-value-locked at DeFi Pulse, we can observe that some DeFi protocols and their contracts play a major role. This observation suggests that they might implement core functionality, which other protocols in DeFi compositions can in turn utilize. Under this assumption, preferential attachment [3, 33] is a plausible generative mechanism for both networks. More generally, networks whose degree distribution follows a power law, i.e., the fraction of vertices with degree $k$ is given by $P(k) \sim k^{-\alpha}$ for values of $k \geq k_{min}$, are often associated with such a generative mechanism. We thus estimate the parameters $\hat{\theta} = (\hat{k}_{min}, \hat{\alpha})$ for our two networks and investigate whether the power law distribution is a good fit.

We rely on the methodology introduced by Clauset et al. [9] and by Broido et al. [6]: evidence of scale-free properties exists either when no alternative heavy-tailed distribution is relatively better than the power law or when the power law is a plausible model for the distribution. In the former case, the network exhibits Super-Weak scale-free structure. In the latter, evidence of scale-free properties is said to be Weak if the tail of the distribution contains at least 50 nodes, and Strong if also $2 < \hat{\alpha} < 3$ holds.

We start by estimating the parameters $\hat{\theta} = (\hat{k}_{min}, \hat{\alpha})$: we select $\hat{k}_{min}$ by minimizing the Kolmogorov-Smirnov distance between empirical and fitted data, and use it to estimate $\hat{\alpha}$ through the method of maximum likelihood estimation [9]. We then conduct a goodness-of-fit test via a bootstrapping procedure ($n = 5,000$). The resulting p-value indicates if the power law is a plausible fit ($p \geq 0.1$) for the empirical data or not.
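The estimation step can be illustrated with a minimal sketch. It assumes the continuous approximation of the power law (node degrees are discrete, so a faithful implementation would use the discrete estimator of Clauset et al.), and the function names `estimate_alpha`, `ks_distance` and `fit_power_law` are hypothetical, not taken from the paper.

```python
import math

def estimate_alpha(degrees, k_min):
    """Maximum likelihood exponent for the tail k >= k_min.

    Assumption: continuous approximation, alpha = 1 + n / sum(ln(k / k_min)).
    """
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / k_min) for k in tail)
    if log_sum == 0:
        raise ValueError("tail is empty or constant; alpha is undefined")
    return 1.0 + len(tail) / log_sum

def ks_distance(degrees, k_min, alpha):
    """Kolmogorov-Smirnov distance between the empirical tail and the model."""
    tail = sorted(k for k in degrees if k >= k_min)
    n = len(tail)
    worst = 0.0
    for i, k in enumerate(tail):
        model_cdf = 1.0 - (k / k_min) ** (1.0 - alpha)
        # Compare against the empirical CDF just before and at the point k.
        worst = max(worst, abs(i / n - model_cdf), abs((i + 1) / n - model_cdf))
    return worst

def fit_power_law(degrees, candidates):
    """Pick the candidate k_min with the smallest KS distance."""
    best = None
    for k_min in candidates:
        try:
            alpha = estimate_alpha(degrees, k_min)
        except ValueError:
            continue  # skip candidates without a usable tail
        distance = ks_distance(degrees, k_min, alpha)
        if best is None or distance < best[2]:
            best = (k_min, alpha, distance)
    return best
```

Calling `fit_power_law(degrees, candidates)` returns a tuple `(k_min, alpha, distance)`. The bootstrap goodness-of-fit test would repeat this fit on synthetic samples drawn from the fitted model and is not shown here.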
Finally, we conduct a log-likelihood ratio ($R$) test to compare the power law fit against other heavy-tailed distributions (i.e., the Exponential, the Lognormal, and the Weibull). A positive value indicates that the power law distribution is favored over the alternative, and the statistical significance is supported by a p-value that indicates if the hypothesis $R = 0$ is rejected ($p < 0.1$) or not ($p \geq 0.1$).

Figure 3 shows the power law fit for both networks and their estimated $\hat{k}_{min}$ and $\hat{\alpha}$. Consistent with other studies on the interaction networks from Ethereum blockchain data [24], $\hat{\alpha}$ lies around 1.7 and 1.8, thus being slightly smaller than the average values usually found for power law distributions. The hypothesis that a power law distribution is a good fit is not plausible for either network, because the p-values are 0.020 and 0.035 for the CA and Protocol networks, respectively. Table 3 reports the comparisons with other heavy-tailed distributions and shows that the power law is not significantly favored over the Lognormal distribution for both networks, while it is a better fit than the Weibull and the Exponential for the Protocol network. In summary, according to the classification proposed in Broido et al. [6], both networks have Super-Weak scale-free properties.

Table 4 inspects the tails of the distributions and reports the top 15 CAs sorted by highest degree: most of the CAs are associated with a few DEX and lending protocols (1inch, UniSwap, 0x, Instadapp, Maker). We can hypothesize that they are part of DeFi compositions, which we will explore further in subsequent sections.

4.2 Centrality measures

The results in the previous section highlight the relevant role of DEXs and lending protocols. Network centrality measures are another helpful tool to determine which nodes might implement core functionalities. We consider the In degree centrality, as we are interested in identifying relevant contracts that other protocols may use in DeFi compositions.
To add further insights, we also provide the results for the Katz and PageRank algorithms. Katz centrality accounts for the importance of a node's neighbors. It is an extension of the eigenvector centrality that addresses issues arising with directed networks [28] by adding a constant initial weight to each node. PageRank takes into account the Out degree of nodes to control for the drawback of the Katz algorithm that peripheral nodes might get too high values if linked to a very central node.

Table 4. First 15 CAs by highest degree.

Address               Label                     Protocol    Degree      In degree   Out degree
0x00000000000049...   CHI Token                 1inch       2,713,153   305,627     2,407,526
0x7a250d5630b4cf...   UniswapV2Router02         UniSwap     56,007      1711        54,296
0xc02aaa39b223fe...   EtherToken-v4             0x          54,469      45,129      9340
0x5c69bee701ef81...   UniswapV2Factory          UniSwap     46,408      26,576      19,832
0x2971adfa57b20e...   Mainnet-InstaIndex        Instadapp   34,497      18,369      16,128
0x4c8a1beb8a8776...   Mainnet-InstaList         Instadapp   33,551      16,956      16,595
0x5ef30b99863452...   CDP_MANAGER               Maker       15,300      8940        6360
0x35d1b3f3d7966a...   MCD_VAT                   Maker       15,214      15,214      0
0xa26e15c895efc0...   PROXY_FACTORY             Maker       13,718      1           13,717
0x0000000000b3f8...   GST2 Token                Unknown     13,447      7644        5803
0x11111112542d85...   contractAddress           1inch       12,371      2073        10,298
0x6b175474e89094...   MCD_DAI                   Maker       12,314      12,314      0
0xdef1c0ded9bec7...   ExchangeProxy-v4          0x          11,147      1138        10,009
0x939daad09fc4a9...   mainnet-v1-InstaAccount   Instadapp   10,876      10,876      0
0xfd3dfb524b2da4...   N/A                       Unknown     10,554      1547        9007

The values of each centrality metric are normalized to the range [0,1]. We find that both networks are dominated by a few nodes with relatively high values (for all centrality measures) with respect to the other nodes; the In degree values are almost always higher than the Katz ones, which in turn are often slightly larger than the PageRank centrality values.
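To make the three measures concrete, the following is a minimal pure-Python sketch on an unweighted edge list. The parameter values (`alpha`, `beta`, `damping`, iteration counts) and the normalization by the maximum value are assumptions chosen for illustration; the paper does not state which settings or library were used.

```python
def in_degree(edges, nodes):
    """Number of incoming edges per node."""
    degree = {node: 0 for node in nodes}
    for _, dst in edges:
        degree[dst] += 1
    return degree

def katz(edges, nodes, alpha=0.1, beta=1.0, iterations=100):
    """Katz centrality: every node gets a constant weight beta plus an
    attenuated share (alpha) of the scores of the nodes pointing to it.
    Assumption: alpha is small enough for the iteration to converge."""
    score = {node: 0.0 for node in nodes}
    for _ in range(iterations):
        updated = {node: beta for node in nodes}
        for src, dst in edges:
            updated[dst] += alpha * score[src]
        score = updated
    return score

def pagerank(edges, nodes, damping=0.85, iterations=100):
    """PageRank: like Katz, but a node's score is split over its out-edges."""
    n = len(nodes)
    successors = {node: [] for node in nodes}
    for src, dst in edges:
        successors[src].append(dst)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not successors[v])
        updated = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in successors[v]:
                updated[w] += damping * rank[v] / len(successors[v])
        rank = updated
    return rank

def normalize(scores):
    """Scale scores so that the largest value is 1 (assumed normalization)."""
    top = max(scores.values())
    if top == 0:
        return dict(scores)
    return {node: value / top for node, value in scores.items()}
```

Under these assumptions, a node that receives calls from many others, such as a router or token contract, obtains the top value 1 in all three normalized measures, while the Out degree only influences PageRank.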
Table 5 reports the values for the nodes with the highest centrality in the Protocol (left) and the DeFi CA (right) networks. We show only the first three nodes because the others have relatively smaller values in comparison. In the Protocol network, the most central nodes are two non-labeled CAs. When considering the ranking of the nodes in the highest 10 positions for at least one centrality measure, 10 DeFi protocols appear in the highest positions, and UniSwap, in particular, plays an important role. Such protocols are thus heavily used by other non-labeled CAs in our dataset. UniSwap, 0x and Maker have higher centrality values with respect to the other protocols.

The DeFi CA network is dominated by the 1inch factory contract mentioned in Section 3.1.2 that deploys CHI tokens. Two other nodes with relatively high values are the wETH CA related to 0x and another factory contract associated with UniSwap. Considering again the nodes ranking in the highest 10 positions for at least one centrality measure, CAs associated with Instadapp and Maker appear repeatedly. Factory deployer contracts play a major role in the DeFi CA network. Note that, by definition, such contracts have a high Out degree, as their functional role is to deploy other contracts. Interestingly, the In degree centrality results show that they also have a relevant role as recipients of calls by other contracts of the network.

In conclusion, these results are consistent with the findings of Section 4.1 in showing that DEX and lending protocols play a major role and may be involved in compositions.

Table 5. In degree, Katz and PageRank centrality measures for the three most central nodes. For the Protocol network (left), the column Address/Protocol reports the address of non-labeled CAs or the protocol name associated with the node.
For the DeFi CA network (right), the column Protocol_Address reports the protocol associated with the CA and the CA itself.

Protocol network                                   | DeFi CA network
Address/Protocol      In degree  Katz   PageRank   | Protocol_Address     In degree  Katz   PageRank
0x0000000000b3f8...   1          1      1          | 1inch_0x00000...     1          1      1
0xcc88a9d330da11...   0.371      0.168  0.111      | 0x_0xc02aa...        0.148      0.107  0.053
UniSwap               0.313      0.176  0.092      | UniSwap_0x5c69b...   0.087      0.064  0.036

4.3 Reciprocity and assortativity

Next, we look at two measures that provide information on the relationship between nodes and their neighbors, that is, reciprocity and assortativity.

Reciprocity is the likelihood that nodes are mutually linked. Values range from 0 to 1, the former meaning that the network is purely unidirectional, the latter indicating that all links are reciprocated. For both the DeFi CA and the Protocol networks, the values (respectively 0.234 and 0.215) are similar to the one reported in [24], and we follow their interpretation that the presence of reciprocated links is a potential sign of composability, as it shows that smart contracts tend to rely often on each other. The lower value obtained for the Protocol network could be explained by the presence of many non-labeled (non-protocol-specific) CAs. If we further reduce the Protocol network by removing all non-labeled CAs, obtaining a graph abstraction of 23 nodes, the reciprocity (0.677) is much higher, indicating that protocols interact with each other more often and in a bidirectional way, a sign that compositions exist.

Assortativity is a metric that indicates whether nodes with similar degrees tend to interact with each other ($1 > r > 0$), or if nodes with high degrees interact more with low degree nodes ($0 > r > -1$).
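Both measures can be sketched in a few lines of Python. The sketch below makes assumptions the text does not settle: self-loops and edge weights are ignored for reciprocity, and assortativity is taken as the Pearson correlation between the total degrees (In plus Out) of the two endpoints of each directed edge. Other degree variants exist for directed graphs, so the values need not match those reported in the paper.

```python
import math

def reciprocity(edges):
    """Share of directed edges whose reverse edge also exists."""
    edge_set = {(src, dst) for src, dst in edges if src != dst}
    if not edge_set:
        return 0.0
    mutual = sum(1 for src, dst in edge_set if (dst, src) in edge_set)
    return mutual / len(edge_set)

def degree_assortativity(edges):
    """Pearson correlation of endpoint degrees over all edges.

    Assumption: total degree (in + out) is used for both endpoints.
    """
    degree = {}
    for src, dst in edges:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    xs = [degree[src] for src, _ in edges]
    ys = [degree[dst] for _, dst in edges]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation undefined; 0.0 is a convention of this sketch
    return cov / math.sqrt(var_x * var_y)
```

In a hub-and-spoke toy graph, where one contract is called by many low-degree contracts, this coefficient is negative, which is the disassortative pattern discussed in this section.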
Consistent with previous results on the Ethereum transaction network, both networks are disassortative (-0.473 for the DeFi CA network and -0.262 for the Protocol network), indicating heterogeneity and a sign that CAs with high degree are leveraged by many other CAs with a less relevant role in the ecosystem. As shown above, such nodes are often associated with DEX and lending protocols.

4.4 Components

Reciprocity shows that protocols interact bidirectionally with accounts related to other protocols. We thus look at metrics providing further insights on how the (code accounts of) different protocols fall into distinct disconnected components. We distinguish between weakly connected components, in which all the nodes are connected by a path independently of the directions of the edges, and strongly connected components, which consider the edge direction.

For the Protocol network, we find that the largest weakly connected component is equal to the entire network, while for the CA network, only 34 nodes are outside of the largest component. The remaining nodes fall into 16 components, with a few nodes each.

Table 6 lists the three largest strongly connected components. By comparing the number of edges and nodes, we notice that the second-largest component of both the Protocol and the CA network is denser than the other large components. Additionally, in Figure 4 we illustrate how CAs belonging to the different protocols map to the ten largest strongly connected components of the CA network. Interestingly, the second-largest component also encompasses the vast majority of protocol interactions. While the largest component is entirely composed of CAs associated with the 1inch protocol, in the second-largest component, we find addresses of all the analyzed protocols except for RenVM, which is not present in any of the reported large components. We also find that all the protocols fall into the second-largest strongly connected component of the Protocol network.
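For readers who want to reproduce the component analysis on an edge list, the following is a minimal sketch of strongly connected component detection using Kosaraju's two-pass algorithm. The algorithm choice and the function name `strongly_connected_components` are assumptions for illustration; the paper does not state how the components were computed.

```python
from collections import defaultdict

def strongly_connected_components(edges):
    """Return the strongly connected components, largest first."""
    forward, backward, nodes = defaultdict(list), defaultdict(list), set()
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: iterative depth-first search recording the finishing order.
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                order.append(node)  # all successors explored
                stack.pop()

    # Pass 2: explore the reversed graph in decreasing finishing order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in backward[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

Weakly connected components can be obtained with the same second pass by treating every edge as bidirectional. Comparing node and edge counts per component then gives the density comparison made in Table 6.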
This analysis shows that interactions among protocols primarily occur in a single, large component that is more interconnected than average. Notably, such interactions might indicate the existence of compositions due to the overlapping transaction structure of multiple protocols.

[Figure 4: heatmap with protocols on the x-axis and component ids with total node counts on the y-axis: 1 (305,581), 2 (69,116), 3 (5622), 4 (36), 5 (34), 6 (20), 7 (16), 8 (15), 9 (11), 10 (9). The color scale gives the node count from 1e+00 to 1e+05.]

Fig. 4. Heatmap showing how the addresses associated with different protocols fall into the ten largest strongly connected components. The largest component is uniquely composed of 305,581 1inch addresses, while the second collects the vast majority of protocols. Smaller components identify addresses of protocols that do not interact outside of the protocol itself.

Table 6. Description of the three largest strongly connected components. For both networks the pattern is fragmented, but interestingly the second largest strongly connected components are remarkably more interconnected, indicating that nodes in these components interact with many other nodes, a prerequisite for composition.

                         Largest              2nd largest          3rd largest
Network    # Comp.       Nodes     Edges      Nodes    Edges       Nodes   Edges
Contract   2,155,707     305,581   611,160    69,116   370,833     5622    11,242
Protocol   33,832        5622      11,242     3948     14,264      36      71

4.5 Community Detection

One could naively assume that CAs associated with specific DeFi protocols form communities in the Code Account network. However, the previous results suggest that the network topology reflects DeFi compositions at the level of the community structure. We thus measure how effectively different community detection algorithms detect protocols in the DeFi CA network. We follow the approach of Yang et al. [51], who provide guidelines for selecting community detection algorithms depending on the size of the network.
We analyze the largest weakly connected component in its unweighted and undirected version with non-overlapping communities using four different algorithms: multilevel or Louvain [5], label propagation [34], leading eigenvector [29], and Leiden [42]. Using the labeled addresses in our ground truth dataset, we can verify to what extent $C$, the set of communities identified by partitioning algorithms, corresponds to $P$, the set of ground truth communities defined by the individual protocols. We quantify their performance through the normalized mutual information (NMI), a benchmark measure in the literature [11, 23] that quantifies the similarity between the ground truth communities and the identified communities.

Table 7. Performance metrics for the community detection algorithms. Low F1 Scores indicate either that the algorithms poorly identify communities, or that the network topology reflects a more complex organization at the mesoscopic level.

Algorithms     Communities   Precision   Recall   F1 Score   NMI      C/P
Louvain        14            0.3896      0.7181   0.2917     0.9241   0.6087
Leiden         10            0.3021      0.8589   0.2879     0.9620   0.4348
Label prop.    53            0.7107      0.6009   0.4892     0.9404   2.3043
Eigenvector    4             0.1696      0.9070   0.1776     0.9495   0.1739

In addition, we provide two additional measures: the ratio $C/P$ of the number of identified communities to the number of ground truth communities, as a measure of the accuracy of the number of identified communities, and the F1 score. We compute the latter similarly to [50]: first, for each protocol $p \in P$ we identify the detected community $c \in C$ that maximizes the F1 score. Then, we report average precision, recall, and F1 scores over all ground truth communities $p \in P$. Note that we compute the above metrics only on the labeled CAs.
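The best-match F1 computation described above can be sketched as follows. The sketch assumes that both the ground truth protocols and the detected communities are given as sets of addresses already restricted to labeled CAs; the function names `best_match_scores` and `macro_average` are hypothetical.

```python
def best_match_scores(truth, detected):
    """For each protocol, find the detected community with the best F1.

    truth: dict mapping protocol name -> set of labeled addresses
    detected: list of sets of labeled addresses (one per community)
    Returns a dict mapping protocol name -> (precision, recall, f1).
    """
    results = {}
    for name, members in truth.items():
        best = (0.0, 0.0, 0.0)
        for community in detected:
            overlap = len(members & community)
            if overlap == 0:
                continue
            precision = overlap / len(community)
            recall = overlap / len(members)
            f1 = 2 * precision * recall / (precision + recall)
            if f1 > best[2]:
                best = (precision, recall, f1)
        results[name] = best
    return results

def macro_average(results):
    """Average precision, recall and F1 over all protocols."""
    count = len(results)
    return tuple(sum(scores[i] for scores in results.values()) / count
                 for i in range(3))
```

Because the averages are taken per protocol and not per address, a few poorly matched small protocols can pull the average F1 down even when most addresses are clustered sensibly, which is one way to read the gap between the NMI and F1 columns of Table 7.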
The second column of Table 7 reports the total number of communities that include labeled CAs. The NMI is high for all the algorithms, indicating that overall the algorithms correctly partition the network: indeed, all algorithms cluster together the CAs created by the 1inch deployer contract, and 1inch is by far the largest ground truth community in terms of labeled accounts. On the other hand, the low F1 scores (0.18-0.49) result from a small set of misclassified ground truth communities (e.g., Compound, DyDx, Fei). Upon closer inspection, we noticed that some protocols map entirely into a few communities dominated by larger protocols (such as UniSwap or Maker), negatively impacting precision, while others are split into different communities, affecting recall. 1inch itself has a non-marginal number of addresses that map into other communities.

In summary, we see that algorithms work well, with NMI scores above 0.92. However, when considering the imbalance in our dataset (precision, recall), we find that known community detection algorithms cannot effectively identify protocols as distinct communities, but rather indicate protocol composition patterns. The identified community structure reflects a different organization in which protocols are entangled.

5 MEASURING DEFI COMPOSITIONS

After analyzing the macroscopic network perspective, we now address the microscopic trace level, where we identify and extract building blocks, i.e., recurring patterns of internal traces induced by protocol-specific CAs that are found as subpatterns within different transactions.

The building block detection can help better understand DeFi compositions and identify a variety of risks. We consider a detailed risk analysis to be future work, but can motivate some sources of risk: for example, if security vulnerabilities are identified in underlying building blocks, they can propagate to higher levels and pose a risk to other DeFi protocols. Atzei et al. [1] analyze the security vulnerabilities of Ethereum code accounts and attacks that exploit them. Legal issues may arise, including licensing issues, thereby limiting usability in other protocols. This phenomenon also exists in traditional software (see https://www.techradar.com/news/this-popular-code-library-is-causing-problems-for-hundreds-of-thousands-of-devs). Finally, the technical evolution of a blockchain can also have an impact on the efficiency or security of an existing building block, and here too it is important to identify which protocols are affected.

Thus, we propose an algorithm to extract the possibly nested structure of DeFi protocol calls, which may also be used by other DeFi protocols. In contrast to recent works that have discovered and exposed DeFi compositions, we provide a systematic, automated mechanism to explore them by using building block extraction. We then assess the most frequent building blocks our algorithm identifies, illustrate possible DeFi compositions, and show how the DEX aggregator 1inch and the Instadapp protocol use multiple such building blocks of other protocols. Further, we flatten the nested structure of building blocks and study the interaction of DEX and lending services. Finally, we present in a case study the dependencies of DeFi protocols on stablecoins, by using our extracted building blocks.

5.1 Building Block Extraction Algorithm

In order to detect building blocks, we treat individual transactions as trees of execution traces, that is, as an abstraction where the external and all the internal transactions are represented as an edge to a new node (thus, the same CA appears multiple times if executed more than once). We break the trees into subtrees, starting from the tree's leaves, and identify a building block whenever we encounter a node that is part of a protocol. If multiple protocol nodes exist in a tree, the building blocks can be composed of one another.
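The bottom-up idea can be illustrated with a simplified recursive sketch. It is not the authors' implementation: the nested-dictionary tree format, the helper `call`, and the exact hash input are assumptions made for illustration, and the sketch hashes the protocol node together with its sub-calls, whereas Algorithm 1 works on edge-induced subtrees sorted by trace ID.

```python
import hashlib

def call(label, method, *children):
    """Hypothetical helper: one executed call with its sub-calls in
    execution order. Labels are assumed to be already generalized."""
    return {"label": label, "method": method, "children": list(children)}

def _signature(label, method, children):
    """Pre-order lists of vertex labels, out-degrees and method names."""
    labels, outdegrees, methods = [label], [len(children)], [method]
    for child in children:
        sub_labels, sub_outdegrees, sub_methods = _signature(*child)
        labels += sub_labels
        outdegrees += sub_outdegrees
        methods += sub_methods
    return labels, outdegrees, methods

def extract_blocks(node, protocol_labels, found):
    """Post-order walk that appends (protocol, hash) pairs to `found`.

    A protocol node that makes further calls becomes a building block;
    towards its parent it is replaced by a leaf carrying the block hash,
    so outer blocks reference inner ones by hash.
    """
    children = [extract_blocks(child, protocol_labels, found)
                for child in node["children"]]
    if node["label"] in protocol_labels and children:
        payload = repr(_signature(node["label"], node["method"], children))
        digest = hashlib.sha256(payload.encode()).hexdigest()
        found.append((node["label"], digest))
        return (digest, node["method"], [])
    return (node["label"], node["method"], children)
```

In this sketch, two swaps that differ only in the tokens involved produce the same hash once token contracts have been generalized to a common label, and an aggregator call that wraps such a swap yields a second, outer block whose leaves include the hash of the inner one.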
To obtain the nested structure, we create a hash of each building block and use those hashes to chain nested tree structures.

Figure 5 illustrates the process from a high-level perspective. Subfigure 5a represents the input, which corresponds to the original transaction trace graph that is also shown in the introductory Figure 1. We aim to identify building blocks that execute the same logic despite being different instances involving different addresses (i.e., a swap with different tokens). We preprocess and generalize the execution trace trees as follows:

Preprocessing: In contrast to a graph, like in Figure 5a, an execution tree can have the same node appearing multiple times as a leaf node, effectively having no cycles. Each edge has a trace ID, determining the order of the calls. If a contract address that has been deployed by a factory appears in a trace, we rename it to $protocol-DEPLOYED. Furthermore, we rename to ASSET all contract addresses that fulfill two criteria: their smart contract code contains the standard ERC20 token method signatures, and within the trace the token contract is called with one such method. The result of these preprocessing steps is shown in Figure 5b. This preprocessing assumes that factory deployed contracts and ERC20 token contracts provide similar functionality. This allows us to generalize the traces, as many similar interactions with various standardized tokens become identical.

[Figure 5: three panels. (a) Original transaction trace graph of the composition as shown in Figure 1. (b) Conversion to execution tree, renaming factory deployed contracts and assets. (c) Identification of general building blocks from protocol nodes with subtraces; the panel marks Building block 1, Building block 2 and Building block 3.]

Fig. 5. A high-level illustration of the building block extraction algorithm. Subfigure 5a represents the input composition.
This graph is then converted into an execution tree as shown in Subfigure 5b, such that each node can only have one incoming edge, requiring the duplication of nodes. In addition, the underlying assets (tokens) and factory deployed contracts are renamed. In this example, the trading pair contracts are factory deployed (FD). This allows for the identification of generalized building blocks, as each trading pair only differentiates itself by the specific assets it is dealing with. The result of the building block extraction is then shown in Subfigure 5c, and is the result of a bottom-up processing of the tree, selecting subtrees of known protocol nodes. See Algorithm 1 for more details.

Algorithm 1: Building Block Extraction
Inputs: (1) Directed, attributed transaction trace tree $T(V, E, \tau, \mu)$ with functions $\tau : E \to \mathbb{N}$ assigning a unique trace ID, and $\mu : E \to \mathbb{N}$ assigning a method ID on the edges of the tree, (2) protocol vertices $P$
Outputs: Lists of building blocks $B$, and hashes $H$
1   $B \leftarrow ()$;  // Init. list of building blocks
2   $H \leftarrow ()$;  // Init. list of building block hashes
3   $E_P \leftarrow (uv) \mid \forall uv \in E : v \in P$;  // Edges to protocol nodes
    // For each edge to a protocol, get subtree
4   $E_S \leftarrow (E_s) \mid$ edges reachable from $e_p$ for each $e_p \in E_P$;
5   $T_S \leftarrow (T[E_s]) \mid \forall E_s \in E_S$;  // edge induced subtrees
6   $T_S \leftarrow$ filter($T_S$, by=tree-depth, minimum=2);
7   $T_S \leftarrow$ sort($T_S$, by=tree-depth, how=ascending);
8   for $T_s(V_s, E_s, \tau, \mu) \in T_S$ do  // for each subtree
9       // Compute building block hash with $V'$, $D$, $M$
10      $E_s \leftarrow$ sort($E_s$, by=$\tau(E_s)$, how=ascending);  // Sort edges
11      $V' \leftarrow (v_1, ..., v_n) = v \mid \forall uv \in E_s : v \in V_s$;  // Vert. list
12      $D \leftarrow deg_{out}(v) \mid \forall v \in V'$;  // Outdegree list
13      $M \leftarrow \mu(E_s)$;  // Method ID list
14      $h \leftarrow$ sha256hash(stringify($V'$, $D$, $M$));
15      $B_s \leftarrow T_s[V']$;  // B. block as vertex induced subtree
16      replace(what=$T_s$, in=$T_S$, with=$h$);
17      $B \leftarrow B \parallel B_s$;  // Append building block
18      $H \leftarrow H \parallel h$;  // Append building block hash
19  end
20  return $B$, $H$

Algorithm 1 takes as input a transaction trace tree $T(V, E, \tau, \mu)$ with two edge attributes: the trace ID $\tau$, indicating the order of execution, and $\mu$, indicating the method ID of the executed call. The second input is a list of seed protocol nodes, such as those described in Section 3.1.2. The algorithm outputs a list of building blocks and hashes of such building blocks.

We first set up the output variables in lines 1-2. We then find edges to the protocol nodes in line 3 and extract all further reachable edges of these to obtain edge-induced subtrees in lines 4-5. We filter them in line 6 to include only those with a minimum depth of 2, such that the protocol node has to make further calls. In line 7, we sort the list of subtrees in ascending order of depth. This means small trees are at the beginning of the list, and large trees that may contain these smaller trees are at the end.

For each subtree (line 8), we compute a hash in lines 9-14, akin to a tree kernel. To compute the hash, we first sort the subtree's edges by order of execution in line 10, and then extract the target vertices of each edge in line 11, essentially excluding the original calling node, which could be different in each transaction. For each of those vertices, we compute the outdegree (line 12), and also determine the method ID for each edge (line 13). The hash is then computed from the three aforementioned properties in line 14. Using the target vertices, we retrieve the building block from the original tree (line 15), which may contain leaf nodes of building block hashes
as replacing subtrees in line 16 can lead to nested building blocks. Finally, we append building block and hash to their lists in lines 17-18. Once all subtrees are processed, the lists are returned in line 20.

An example of the algorithm's result can be seen in Figure 5c, showing three building blocks, one each from SushiSwap, UniSwap and 1inch. Note that the building block of 1inch contains the other two building blocks.

[Figure 6: tree diagrams of the ten building blocks, each labeled with its root method (swap, swapExactETHForTokens, swapExactTokensForTokens, withdraw, or uniswapV3SwapCallback), root protocol and count: 1) uniswap (21,769,746), 2) uniswap (6,198,521), 3) 0x (5,910,146), 4) uniswap (1,804,012), 5) sushiswap (1,250,574), 6) uniswap (1,037,881), 7) uniswap (1,007,538), 8) uniswap (857,377), 9) uniswap (848,682), 10) uniswap (810,287).]

Fig. 6. The 10 most frequently observed building blocks by called root method, root protocol and count. Nodes marked with FD are generalized factory deployed contracts and those marked with A are ERC20 assets. The majority of these building blocks originate from UniSwap. Note that block 1 of UniSwap is equivalent to number 5 of SushiSwap. This makes sense, as SushiSwap is a fork of UniSwap. Number 1 is contained in building blocks 2 and 4, illustrating an internal composition within the same protocol. Building block 3 represents the withdrawal of Wrapped Ether (WETH) and is associated with the protocol 0x. Also note that several root methods are identical, yet can lead to different types of building blocks.

5.2 Building Block Analysis

We execute the algorithm on all transactions in our dataset, together with the set of DeFi protocols in our labeled extended seed set (cf. Section 3). We can then count the retrieved building blocks by their hashes, understand their composition, and visualize them. Figure 6 illustrates the top 10 most frequently observed building blocks, of which eight belong to UniSwap.
The most frequent building block is a UniSwap swap, with more than 21 million occurrences. As UniSwap is one of the most popular DeFi protocols, and token swaps are its main functionality, this result shows that the building block extraction is meaningful. We further observe that the swap building block is reoccurring and contained in other patterns that appear frequently. Another relevant block is related to 0x's Wrapped Ether (WETH), which in our context is not classified as an asset due to its use of withdrawal, a non-ERC20 function. In the following, we will provide more insights into the nested structure from different perspectives and discuss their interpretations.

5.2.1 Protocol Building Block Composition. Starting from the execution tree structure of each trace, the algorithm identifies subtrees. Those building blocks obtained from Algorithm 1 can contain leaves with hashes that point to other building blocks, leading to a nested structure that still preserves the primary tree structure of the traces. But a single transaction only represents a small snapshot of the entire tree of possible compositions. For a comprehensive image of the DeFi protocols composition space, we have to consider multiple transactions.

[Figure 7: network tree rooted at an EOA node, with building block nodes colored by DeFi protocol: 0x, aave, balancer, compound, curvefinance, dydx, hegic, maker, sushiswap, synthetix, uniswap.]

Fig. 7. Illustrating the composition space of Aave as a network tree. Each node represents a building block, each link a possible nested building block, extracted from all transactions to Aave. We observe for this protocol a maximum depth of seven nested DeFi building block levels.

To observe the space of all possible compositions, we construct a network of overlapping building block trees for all transactions of the same initial (external) DeFi protocol. For an illustrative example, we used the extracted building block structures of all transactions to Aave.
The network still conserves the tree structure, where each node represents a building block and each link a nested composition, observed in the transactions. Figure 7 shows the Aave network and illustrates its multiple nested levels. Starting from the top with external transactions from EOAs to Aave, a variety of paths and compositions can be seen, presenting the space of all possible compositions, observed from existing transaction data.

Nevertheless, this network illustration does not provide a comprehensive picture of the volume (i.e., number of appearances) of those compositions and the number of branches, when a building block calls multiple sub-blocks. We can inspect for each building block the set of contained protocols and the volume of their appearances: the treemaps in Figure 8 illustrate the shares of protocols appearing in the building block structure of a specific nested level. In Figure 8a we observe the volume of building block calls and associated protocols in the first level for the protocol 1inch. The largest fraction are external transactions that do not contain any other building blocks; this is captured by the box labeled as NONE. All other boxes show instead the share of transactions in which one or multiple DeFi service building blocks are nested. We group them using different colors based on the number of unique, distinct protocols that are called in the subsequent building blocks of this level. For instance, yellow boxes indicate the fraction of transactions in which the appearing nested building blocks in the first level are associated with one single DeFi protocol, while blue boxes represent the fraction in which the building blocks in the first level are associated with two different protocols. We further observe portions of transactions that contain building blocks assigned to more than two protocols within the first nested level. Moreover, the treemap in Figure 8b shows branches in a deeper level within Instadapp transactions.
In the fourth level of self-compositions, besides the fraction that does not contain any further block (NONE), an even bigger share of building blocks appears that are associated with one single DeFi protocol. We also observe again the existence of building blocks associated with two or more protocols.

Fig. 8. Inspecting the potentially nested building blocks used by the first level of 1inch (left, panel (a) 1inch) and the fourth level of Instadapp (right, panel (b) instadapp→instadapp→instadapp→instadapp). The size of each box represents the share of building blocks assigned to one or more unique protocols. For 1inch transactions, at the first nested level, about a third of the used building blocks are of one (chiefly other) protocol (yellow boxes). An even bigger fraction can be observed for Instadapp, but in the fourth nested building block level.

These two illustrations in Figure 8 give insights into our systematic investigation of compositions, and show that looking only at selected compositions or single nested levels of DeFi compositions would return a partial picture: interactions among protocols can be iteratively nested within each other and can take place in deeply nested levels. Therefore, a further investigation to disentangle and flatten the nested structure is needed.

5.2.2 Flattening Composition Hierarchies. We then want to investigate to what extent the DeFi protocols leverage other protocols to provide their services. That is, we want to identify a mapping of top-level protocols to any of the building blocks they make use of, whether deeply nested or not. To get an overall picture of the DeFi compositions, we flatten the nested building block structures.
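The flattening described in this subsection can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code used for the measurements: a transaction is assumed to be a pair of the root protocol and its list of building blocks, each block being a pair of a protocol label and its nested blocks. The helper names and the example data are hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: transaction = (root_protocol, [blocks]),
#             block       = (protocol_label, [nested blocks]).


def protocols_in(blocks):
    """Set of protocols found anywhere in a nested list of blocks."""
    found = set()
    stack = list(blocks)
    while stack:
        protocol, nested = stack.pop()
        found.add(protocol)
        stack.extend(nested)
    return found


def flatten(transactions):
    """For each root protocol, the share (in %) of its external
    transactions that contain a building block of each protocol."""
    tx_count = Counter()
    hits = defaultdict(Counter)
    for root, blocks in transactions:
        tx_count[root] += 1
        # Transactions without any building block are counted as NONE.
        for protocol in protocols_in(blocks) or {"NONE"}:
            hits[root][protocol] += 1
    return {
        root: {p: 100.0 * n / tx_count[root] for p, n in counts.items()}
        for root, counts in hits.items()
    }


transactions = [
    ("1inch", [("uniswap", [("0x", [])])]),
    ("1inch", [("sushiswap", []), ("uniswap", [])]),
    ("1inch", []),
    ("dydx", [("dydx", [])]),
]
shares = flatten(transactions)
print(shares["dydx"])  # {'dydx': 100.0}
```

Each protocol is counted at most once per transaction, so the resulting value can be read as the share of external transactions in which that protocol appears at any nesting depth.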
In each transaction, we follow the cascade of nested building blocks and create a mapping from the protocol of each contained building block to the original DeFi protocol that the external transaction was sent to (the root protocol). If mappings appear multiple times over different transactions, we aggregate them. For each root protocol, we can then compute, over all transactions, how frequently building blocks of each associated protocol are contained. The result is a measure that indicates, for a given root protocol, the probability that a certain building block of a DeFi protocol appears anywhere in the (nested) building block structure.

In Figure 9 we show the building block appearances of lending, DEX, derivatives and asset protocols with a heat map. Each row corresponds to the external calls to a specific protocol, and the row entries indicate the frequencies of the occurrence of a protocol’s building blocks. The relative share measurement is the fraction of internal building blocks based on the number of external transactions. Note that the category NONE indicates the share of transactions for which no building blocks have been found. Most protocol interactions exist within each protocol, visible by the highlighted diagonal elements. This pattern is especially remarkable for derivative protocols. Consider, e.g., dYdX: all external transactions directed to it contain at least one dYdX building block. However, DeFi aggregation protocols such as Instadapp, 1inch, and 0x in particular show extensive use of other DeFi services and thus frequent occurrences of DeFi compositions. This indicates that Algorithm 1 works as intended, as, by definition, aggregation protocols must call other protocols. The frequent appearance of the 0x protocol can be attributed to the popular Wrapped Ether token and its withdraw pattern, already observed and shown in Figure 6. Further, we note that, second to 0x, UniSwap building blocks appear in most transactions to the protocols shown in Figure 9. Derivatives protocols instead have little or no further interaction with other protocols, as shown in the rows associated with derivatives in the matrix of heat maps, and the assets protocols likewise do not interact heavily with other protocols.

Fig. 9. Appearances of DeFi service building blocks across protocols. The numbers indicate the percentage of transactions in which a building block of a certain protocol is contained. The use of multiple DeFi services can be observed for DeFi aggregation protocols, like Instadapp, 1inch and 0x. (Heat map; rows: external transaction to protocol; columns: protocol building blocks appearance [%]; protocols grouped into Lending, DEX, Derivatives, Assets, and Others.)

5.3 Case Study: A hypothetical run on the Tether
In May 2022, we witnessed the collapse of the Terra ecosystem and its stablecoin TerraUSD (UST), which maintained its peg to the US Dollar through an arbitrage mechanism with the token LUNA. This triggered a so-called stablecoin run and destroyed over 30B USD of value within a single week.
Motivated by this recent demonstration of systemic risk associated with stablecoins, we apply our building block extraction and analysis methods to measure how a hypothetical run on the stablecoin Tether (USDT, contract address 0xdAC17F958D2ee523a2206206994597C13D831ec7), which is the most widely adopted stablecoin in Ethereum, would affect known DeFi protocols based on building block dependencies. We distinguish between direct dependencies, where USDT is an explicit part of a building block, and indirect dependencies, where USDT appears somewhere in its nested building blocks. Starting with the most frequent building blocks (see Figure 6), we analyzed the occurrence of USDT in the regularly used sub-patterns of transactions. We detected USDT in 10.6% of ‘swap’ building blocks from UniSwap (1) and 16.2% from SushiSwap (5). For the ‘swapExactTokensForTokens’ building block from UniSwap (2), we find an even higher direct occurrence of 22.7% and an indirect dependency of a further 21.2% through the nested block structure, which contains the aforementioned ‘swap’ building blocks from UniSwap (1).

Fig. 10. Dependencies of building blocks on the USDT crypto asset for each DeFi protocol, distinguishing between directly included assets and indirect inclusion through other nested blocks.

In order to obtain a broader picture of the dependencies in the DeFi ecosystem, we also analyzed, for each protocol, the fraction of building blocks containing the USDT asset directly or indirectly in more deeply nested blocks. Our results, which are summarized in Figure 10, show that most protocols have rather low dependencies (< 10%).
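The distinction between direct and indirect dependencies can be made concrete with a small Python sketch. It is only an illustration under assumptions: the Block class, its fields, and the example blocks are hypothetical, and only the USDT contract address is taken from the text above.

```python
from collections import Counter
from dataclasses import dataclass, field

USDT = "0xdAC17F958D2ee523a2206206994597C13D831ec7"


@dataclass
class Block:
    """Hypothetical building block: protocol, own assets, nested blocks."""
    protocol: str
    assets: set = field(default_factory=set)
    nested: list = field(default_factory=list)


def contains(block, asset):
    """True if the asset appears in the block or anywhere below it."""
    return asset in block.assets or any(contains(b, asset) for b in block.nested)


def classify(block, asset=USDT):
    """'direct' if the block itself uses the asset, 'indirect' if only
    a nested block does, 'none' otherwise."""
    if asset in block.assets:
        return "direct"
    if any(contains(b, asset) for b in block.nested):
        return "indirect"
    return "none"


def dependency_shares(blocks, asset=USDT):
    """Per protocol, the percentage of blocks with each dependency type."""
    totals = Counter()
    hits = {"direct": Counter(), "indirect": Counter()}
    for block in blocks:
        totals[block.protocol] += 1
        kind = classify(block, asset)
        if kind != "none":
            hits[kind][block.protocol] += 1
    return {
        p: {k: 100.0 * hits[k][p] / n for k in hits}
        for p, n in totals.items()
    }


swap_usdt = Block("uniswap", {USDT, "WETH"})
swap_other = Block("uniswap", {"WETH", "DAI"})
router = Block("uniswap", {"WETH"}, [swap_usdt])  # USDT only via nested swap
lend = Block("compound", {"DAI"}, [swap_other])
shares = dependency_shares([swap_usdt, swap_other, router, lend])
print(classify(router))  # indirect
```

In this toy example, a block that uses USDT itself is counted as direct even if a nested block uses it too, so the two categories do not overlap.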
However, 14.2% of Curvefinance building blocks include the USDT asset directly, and the two DEX protocols UniSwap and SushiSwap also strongly depend on that asset. This is in line with our previous finding that ‘swap’ is the most frequent building block. We further find that Compound and Instadapp building blocks have comparatively high indirect dependencies on the USDT asset. These dependencies indicate how a shock in the DeFi ecosystem, such as a run on a stablecoin, could affect DeFi protocols, directly and indirectly, through their services. Since USDT has become a multi-chain asset, which is also traded and used on other blockchains (e.g., Binance Smart Chain, Avalanche), such shocks could also spread across chains and lead to systemic failures. However, we consider this analysis a first step towards understanding systemic risk and leave a deeper investigation for future work.

6 DISCUSSION
In this section, we discuss some of the insights from our analyses, as well as the limitations of our work.

6.1 Insights
Cryptoassets are not a niche phenomenon anymore. They reached an overall market capitalization of more than 2T USD (Nov. 2021) and are increasingly interconnected with the traditional financial systems. With DeFi, we now see the introduction of leveraged financial products and assets that are backed with some poorly understood virtual securities. Our results provide initial insights into the motivating questions mentioned in the introduction. Concerning ecosystem interoperability, we found that compositions between DEX protocols are particularly frequent in our dataset (cf. Figure 9).
From this, we can conclude that these protocols should ideally be deployed on the same DLT platform as long as single-transaction cross-chain compositions are not possible. At the same time, however, we also found that derivative protocols in particular still contain relatively few compositions. This suggests that a protocol-type-specific scaling solution could be useful, for example, a sidechain for derivative protocols. Fewer compositions would still be possible, but not with as significant a negative impact as when separating DEX protocols.

As far as integration with web technologies is concerned, the versatile use of building blocks shows that elementary constructs are already reused and integrated by various applications, without this necessarily being transparent to the users. This view is further reinforced when considering that various assets are already integrated into web technologies, but their simultaneous inclusion in financial instruments and compositions is barely obvious. An example of this is the BAT token, which is integrated into the Brave browser but is also used in various DeFi protocols.

Finally, turning to risks through complexity, we recall that the financial crisis in 2008 has shown that a lack of understanding and a lack of regulation can pose unforeseeable risks for the financial markets and our society as a whole. Whilst composability unleashes unexplored possibilities, it may also lead to unforeseen risks. Indeed, although DeFi protocols are aware of, and often even facilitate, the use of their own CAs in composition with those of other protocols, these interconnected novel financial services lack a form of coordination on the resulting compositions. Thus, unintended forms of interaction across protocols could take place, exposing users to risk, even more so when calls are iteratively nested and several protocols are indirectly involved.
If the DeFi ecosystem evolves at the current pace and integrates closely with the traditional financial sector, associated systemic risks must be understood and mitigated. Our work shows how DeFi protocols can be decomposed and how the share of protocol interactions can be visualized (cf. Figure 8). With our case study, we simulate a hypothetical run on Tether and show how our method can provide first insights into how DeFi protocols and their services could be affected, also through cascading effects from other protocols. This shows the potential for further studies to evaluate systemic risk.

6.2 Limitations
We acknowledge and point out some limitations of our work. First, our results naturally reflect only the compositions of the protocols and labeled addresses contained in our ground truth dataset. Since the DeFi landscape is evolving rapidly, extending our seed data and the observation period, as well as investigating the temporal evolution of the DeFi protocols, is an obvious next step. One can then re-run our generally applicable analytics procedures. We emphasize, however, that while a longitudinal analysis of DeFi usage over a longer time frame would be of interest, our main contribution is the devised methodology to uncover compositions. The time frame and extent of the DeFi protocol activity we investigated are sufficiently large for this (static) analysis. Second, as we focused on composability, we did not investigate some features of the network topology, such as its small-world properties (e.g., clustering coefficients and path lengths); we studied recurrent patterns by decomposing individual transactions as nested building blocks, rather than studying triadic (or higher order) motifs and core decomposition methods; Topological Data Analysis (TDA) has been exploited in the literature mostly in predictive models to identify anomalous patterns, which is beyond the scope of our work; similarly, temporal aspects are left for future work, as discussed previously. In our network analysis, we currently neglect edge weights between CAs, which may indicate the strength of composition. Including them could also be part of future work. Third, our building block extraction algorithm currently yields the building blocks of known DeFi protocols. We believe that future work should aim at a more systematic evaluation using a curated ground truth of DeFi compositions. Finally, we point out that currently we mainly focus on single-transaction interactions between CAs. However, DeFi compositions could also be constructed by EOAs over time using multiple transactions. We do not yet consider this aspect in our analysis, but we deem it one of the most promising avenues for future work.

7 CONCLUSION
The overall goal of our work is to provide methods and results that contribute to a better understanding of DeFi protocols, which are a new family of financial products. We manually curated a ground truth set of 23 DeFi protocols, which can be reused in future research. We constructed network abstractions representing the interactions between smart contracts (CAs) and DeFi protocols and conducted a topology analysis in the timespan from Jan-2021 to Aug-2021. The results indicate the existence of compositions, which is further supported by our finding that known community detection algorithms cannot disentangle DeFi protocols. Therefore, we proposed an algorithm that extracts the building blocks of DeFi protocols from transactions. We assessed the most frequent blocks and found that swaps play an essential role. We also analyzed individual DeFi protocols by disentangling their building blocks and flattened the composition hierarchies of all DeFi protocol transactions in our dataset. We provide a case study that uncovers how the building blocks depend on the USDT stablecoin.
This shows how the proposed method can help identify potential systemic risk by measuring to what extent each protocol is affected by the propagating shock of a single entity, originating from vulnerabilities, legal issues or technical advances. Finally, we have discussed the implications and limitations of our work, providing first insights into questions about interoperability, integration with Web technologies, and systemic risks that may arise in complex financial systems. In summary, our work is the first that investigates DeFi compositions across multiple protocols, both from a network perspective and at the level of individual transactions. We believe that our methods make an essential contribution to understanding the bigger picture and the basic building blocks of individual DeFi protocols and their relationships across protocols.

REFERENCES
[1] Nicola Atzei, Massimo Bartoletti, and Tiziana Cimoli. 2017. A Survey of Attacks on Ethereum Smart Contracts SoK. In Proceedings of the 6th International Conference on Principles of Security and Trust - Volume 10204. Springer-Verlag, Berlin, Heidelberg, 164–186.
[2] Martin Neil Baily, Robert E. Litan, and Matthew S. Johnson. 2008. The Origins of the Financial Crisis. Technical Report. Brookings Institution.
[3] Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. Science 286, 5439 (1999), 509–512.
[4] Rafael Belchior, André Vasconcelos, Sérgio Guerreiro, and Miguel Correia. 2021. A survey on blockchain interoperability: Past, present, and future trends. ACM Computing Surveys (CSUR) 54, 8 (2021), 1–41.
[5] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. 2008. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment 2008, 10 (2008), P10008.
[6] Anna D Broido and Aaron Clauset. 2019. Scale-free networks are rare. Nature communications 10, 1 (2019), 1–10.
[7] Ting Chen, Zihao Li, Yuxiao Zhu, Jiachi Chen, Xiapu Luo, John Chi-Shing Lui, Xiaodong Lin, and Xiaosong Zhang. 2020. Understanding ethereum via graph analysis. ACM Transactions on Internet Technology (TOIT) 20, 2 (2020), 1–32.
[8] Weili Chen, Tuo Zhang, Zhiguang Chen, Zibin Zheng, and Yutong Lu. 2020. Traveling the Token World: A Graph Analysis of Ethereum ERC20 Token Ecosystem. In Proceedings of The Web Conference 2020 (WWW ’20). Association for Computing Machinery, 1411–1421. DOI:https://doi.org/10.1145/3366423.3380215
[9] Aaron Clauset, Cosma Rohilla Shalizi, and Mark EJ Newman. 2009. Power-law distributions in empirical data. SIAM review 51, 4 (2009), 661–703.
[10] Philip Daian, Steven Goldfeder, Tyler Kell, Yunqi Li, Xueyuan Zhao, Iddo Bentov, Lorenz Breidenbach, and Ari Juels. 2020. Flash boys 2.0: Frontrunning in decentralized exchanges, miner extractable value, and consensus instability. In 2020 IEEE Symposium on Security and Privacy (SP).
[11] Leon Danon, Albert Diaz-Guilera, Jordi Duch, and Alex Arenas. 2005. Comparing community structure identification. Journal of statistical mechanics: Theory and experiment 2005, 09 (2005), P09008.
[12] DeFi Pulse. 2021. Total Value Locked (USD) in DeFi. (7 2021). https://defipulse.com/
[13] Daniel Engel and Maurice Herlihy. 2021. Composing Networks of Automated Market Makers. arXiv preprint arXiv:2106.00083 (2021).
[14] Michael Fröwis, Andreas Fuchs, and Rainer Böhme. 2019. Detecting Token Systems on Ethereum. In Financial Cryptography and Data Security, Ian Goldberg and Tyler Moore (Eds.). Springer International Publishing, Cham, 93–112.
[15] Lewis Gudgeon, Pedro Moreno-Sanchez, Stefanie Roos, Patrick McCorry, and Arthur Gervais. 2020. Sok: Layer-two blockchain protocols. In International Conference on Financial Cryptography and Data Security. Springer, 201–226.
[16] L. Gudgeon, D. Perez, D. Harz, B. Livshits, and A. Gervais. 2020. The Decentralized Financial Crisis. In 2020 Crypto Valley Conference on Blockchain Technology (CVCBT). 1–15. DOI:https://doi.org/10.1109/CVCBT50464.2020.00005
[17] Lewis Gudgeon, Sam Werner, Daniel Perez, and William J Knottenbelt. 2020. Defi protocols for loanable funds: Interest rates, liquidity and market efficiency. In Proceedings of the 2nd ACM Conference on Advances in Financial Technologies. 92–112.
[18] Dongchao Guo, Jiaqing Dong, and Kai Wang. 2019. Graph structure and statistical properties of Ethereum transaction relationships. Information Sciences 492 (2019), 58–71.
[19] Campbell R Harvey, Ashwin Ramachandran, and Joey Santoro. 2021. DeFi and the Future of Finance. John Wiley & Sons.
[20] Maurice Herlihy. 2018. Atomic Cross-Chain Swaps. CoRR abs/1801.09515 (2018). arXiv:1801.09515 http://arxiv.org/abs/1801.09515
[21] Andrei Kirilenko, Albert S Kyle, Mehrdad Samadi, and Tugkan Tuzun. 2017. The flash crash: High-frequency trading in an electronic market. The Journal of Finance 72, 3 (2017), 967–998.
[22] Ariah Klages-Mundt and Andreea Minca. 2021. (In)Stability for the Blockchain: Deleveraging Spirals and Stablecoin Attacks. Cryptoeconomic Systems 1, 2 (oct 22 2021). https://cryptoeconomicsystems.pubpub.org/pub/klages-mundt-blockchain-instability.
[23] Andrea Lancichinetti and Santo Fortunato. 2009. Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Physical Review E 80, 1 (2009), 016118.
[24] Xi Tong Lee, Arijit Khan, Sourav Sen Gupta, Yu Hann Ong, and Xuan Liu. 2020. Measurements, Analyses, and Insights on the Entire Ethereum Blockchain Network. In Proceedings of The Web Conference 2020 (WWW ’20). Association for Computing Machinery, 155–166. DOI:https://doi.org/10.1145/3366423.3380103
[25] Yitao Li, Umar Islambekov, Cuneyt Akcora, Ekaterina Smirnova, Yulia R Gel, and Murat Kantarcioglu. 2020. Dissecting ethereum blockchain analytics: What we learn from topology and geometry of the ethereum graph?. In Proceedings of the 2020 SIAM international conference on data mining. SIAM, 523–531.
[26] Bowen Liu, Pawel Szalachowski, and Jianying Zhou. 2020. A first look into defi oracles. arXiv preprint arXiv:2005.04377 (2020).
[27] Debasis Mohanty, Divya Anand, Hani Moaiteq Aljahdali, and Santos Gracia Villar. 2022. Blockchain Interoperability: Towards a Sustainable Payment System. Sustainability 14, 2 (2022). DOI:https://doi.org/10.3390/su14020913
[28] Mark Newman. 2018. Networks. Oxford university press.
[29] Mark EJ Newman. 2006. Finding community structure in networks using the eigenvectors of matrices. Physical review E 74, 3 (2006),
[30] Dorcas Ofori-Boateng, I Segovia Dominguez, C Akcora, M Kantarcioglu, and Yulia R Gel. 2021. Topological anomaly detection in dynamic multilayer blockchain networks. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 788–804.
[31] Daniel Perez, Sam M Werner, Jiahua Xu, and Benjamin Livshits. 2020. Liquidations: DeFi on a Knife-edge. arXiv preprint arXiv:2009.13235 (2020).
[32] Farimah Poursafaei, Reihaneh Rabbany, and Zeljko Zilic. 2021. SigTran: Signature Vectors for Detecting Illicit Activities in Blockchain Transaction Networks. In Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 27–39.
[33] Derek de Solla Price. 1976. A general theory of bibliometric and other cumulative advantage processes. Journal of the American society for Information science 27, 5 (1976), 292–306.
[34] Usha Nandini Raghavan, Réka Albert, and Soundar Kumara. 2007. Near linear time algorithm to detect community structures in large-scale networks. Physical review E 76, 3 (2007), 036106.
[35] Rahul Rai. 2022. The Death Spiral: How Terra’s Algorithmic Stablecoin Came Crashing Down. (2022). https://www.forbes.com/sites/rahulrai/2022/05/17/the-death-spiral-how-terras-algorithmic-stablecoin-came-crashing-down/?sh=41275c6a71a2 Retrieved on 2022-06-05.
[36] Fabian Schär. 2021. Decentralized Finance: On Blockchain- and Smart Contract-Based Financial Markets. Federal Reserve Bank of St. Louis Review 2 (2021), 153–74. DOI:https://doi.org/10.20955/r.103.153-74
[37] Cosimo Sguanci, Roberto Spatafora, and Andrea Mario Vergani. 2021. Layer 2 blockchain scaling: A survey. arXiv preprint arXiv:2107.10881 (2021).
[38] Amritraj Singh, Kelly Click, Reza M Parizi, Qi Zhang, Ali Dehghantanha, and Kim-Kwang Raymond Choo. 2020. Sidechain technologies in blockchain networks: An examination and state-of-the-art review. Journal of Network and Computer Applications 149 (2020), 102471.
[39] Shahar Somin, Goren Gordon, and Yaniv Altshuler. 2018. Network Analysis of ERC20 Tokens Trading on Ethereum Blockchain. In Unifying Themes in Complex Systems IX, Alfredo J. Morales, Carlos Gershenson, Dan Braha, Ali A. Minai, and Yaneer Bar-Yam (Eds.). Springer International Publishing, Cham, 439–450.
[40] Louis Tremblay Thibault, Tom Sarry, and Abdelhakim Senhaji Hafid. 2022. Blockchain Scaling using Rollups: A Comprehensive Survey. IEEE Access (2022).
[41] Palina Tolmach, Yi Li, Shang-Wei Lin, and Yang Liu. 2021. Formal Analysis of Composable DeFi Protocols. CoRR abs/2103.00540 (2021). arXiv:2103.00540 https://arxiv.org/abs/2103.00540
[42] Vincent A Traag, Ludo Waltman, and Nees Jan Van Eck. 2019. From Louvain to Leiden: guaranteeing well-connected communities. Scientific reports 9, 1 (2019), 1–12.
[43] Friedhelm Victor and Bianca Katharina Lüders. 2019. Measuring Ethereum-Based ERC20 Token Networks. In Financial Cryptography and Data Security - 23rd International Conference, FC 2019, Frigate Bay, St. Kitts and Nevis, February 18-22, 2019, Revised Selected Papers (Lecture Notes in Computer Science), Ian Goldberg and Tyler Moore (Eds.), Vol. 11598. Springer, 113–129. DOI:https://doi.org/10.1007/978-3-030-32101-7_8
[44] Victor von Wachter, Johannes Rude Jensen, and Omri Ross. 2021. Measuring Asset Composability as a Proxy for DeFi Integration. In International Conference on Financial Cryptography and Data Security. Springer, 109–114.
[45] Dabao Wang, Siwei Wu, Ziling Lin, Lei Wu, Xingliang Yuan, Yajin Zhou, Haoyu Wang, and Kui Ren. 2021. Towards A First Step to Understand Flash Loan and Its Applications in DeFi Ecosystem. In Proceedings of the Ninth International Workshop on Security in Blockchain and Cloud Computing.
[46] Gang Wang. 2021. Sok: Exploring blockchains interoperability. Cryptology ePrint Archive (2021).
[47] Ye Wang, Yan Chen, Shuiguang Deng, and Roger Wattenhofer. 2021. Cyclic Arbitrage in Decentralized Exchange Markets. Available at SSRN 3834535 (2021).
[48] Ye Wang, Lioba Heimbach, and Roger Wattenhofer. 2021. Behavior of Liquidity Providers in Decentralized Exchanges. arXiv preprint arXiv:2105.13822 (2021).
[49] Sam M. Werner, Daniel Perez, Lewis Gudgeon, Ariah Klages-Mundt, Dominik Harz, and William J. Knottenbelt. 2021. SoK: Decentralized Finance (DeFi). (2021). arXiv:cs.CR/2101.08778
[50] Jaewon Yang and Jure Leskovec. 2012. Community-affiliation graph model for overlapping network community detection. In 2012 IEEE 12th international conference on data mining. IEEE, 1170–1175.
[51] Zhao Yang, René Algesheimer, and Claudio J Tessone. 2016. A comparative analysis of community detection algorithms on artificial networks. Scientific reports 6, 1 (2016), 1–18.
[52] Alexei Zamyatin, Mustafa Al-Bassam, Dionysis Zindros, Eleftherios Kokoris-Kogias, Pedro Moreno-Sanchez, Aggelos Kiayias, and William J Knottenbelt. 2021. Sok: Communication across distributed ledgers. In International Conference on Financial Cryptography and Data Security. Springer, 3–36.
[53] Dirk A Zetzsche, Douglas W Arner, and Ross P Buckley. 2020. Decentralized finance. Journal of Financial Regulation 6, 2 (2020), 172–203.
[54] Lin Zhao, Sourav Sen Gupta, Arijit Khan, and Robby Luo. 2021. Temporal Analysis of the Entire Ethereum Blockchain Network. In Web Conference 2021 (WWW’21).

ACM Transactions on the Web (TWEB), Association for Computing Machinery


Publisher
Association for Computing Machinery
Copyright
Copyright © 2023 Association for Computing Machinery.
ISSN
1559-1131
eISSN
1559-114X
DOI
10.1145/3532857

Abstract

STEFAN KITZLER, Complexity Science Hub Vienna and AIT - Austrian Institute of Technology, Austria FRIEDHELM VICTOR, Technische Universität Berlin, Germany PIETRO SAGGESE, AIT - Austrian Institute of Technology and Complexity Science Hub Vienna, Austria BERNHARD HASLHOFER, Complexity Science Hub Vienna, Austria We present a measurement study on compositions of Decentralized Finance (DeFi) protocols, which aim to disrupt traditional inance and ofer services on top of distributed ledgers, such as Ethereum. Understanding DeFi compositions is of great importance, as they may impact the development of ecosystem interoperability, are increasingly integrated with web technologies, and may introduce risks through complexity. Starting from a dataset of 23 labeled DeFi proto,663 cols ,881 and 10 associated Ethereum accounts, we study the interactions of protocols and associated smart contracts. From a network perspective, we ind that decentralized exchange (DEX) and lending protocol account nodes have high degree and centrality values, that interactions among protocol nodes primarily occur in a strongly connected component, and that known community detection methods cannot disentangle DeFi protocols. Therefore, we propose an algorithm to decompose a protocol call into a nested set of building blocks that may be part of other DeFi protocols. This allows us to untangle and study protocol compositions. With a ground truth dataset we have collected, we can demonstrate the algorithm’s capability by inding that swaps are the most frequently used building blocks. As building blocks can be nested, i.e., contained in each other, we provide visualizations of composition trees for deeper inspections. We also present a broad picture of DeFi compositions by extracting and lattening the entire nested building block structure across multiple DeFi protocols. 
Finally, to demonstrate the practicality of our approach, we present a case study that is inspired by the recent collapse of the UST stablecoin in the Terra ecosystem. Under the hypothetical assumption that the stablecoin USD Tether would experience a similar fate, we study which building blocks and, thereby, DeFi protocols would be afected. Overall, our results and methods contribute to a better understanding of a new family of inancial products. CCS Concepts: · Applied computing→ Digital cash; Electronic funds transfer. Additional Key Words and Phrases: Decentralized Finance, DeFi, Blockchain, Ethereum, Networks 1 INTRODUCTION Decentralized Finance (DeFi) stands for a new paradigm that aims to disrupt established inancial markets. It ofers inancial services in the form smart ofcontracts, which are executable software programs deployed on top of distributed ledger technologies (DLT) such as Ethereum. Despite being a relatively recent development, we can already observe rapid growth in DeFi protocols enabling lending of virtual assets, exchanging them for other virtual assets without intermediaries, or betting on future price developments in the form of derivatives like options and futures. The term łinancial legož is sometimes used because DeFi services comp canose bed into new inancial products and services. Authors’ addresses: Stefan Kitzler, kitzler@csh.ac.at, Complexity Science Hub Vienna and AIT - Austrian Institute of Technology, Vienna, Austria; Friedhelm Victor, friedhelm.victor@tu-berlin.de, Technische Universität Berlin, Berlin, Germany; Pietro Saggese, pietro.saggese@ait. ac.at, AIT - Austrian Institute of Technology and Complexity Science Hub Vienna, Vienna, Austria; Bernhard Haslhofer, haslhofer@csh.ac.at, Complexity Science Hub Vienna, Vienna, Austria. 
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for proit or commercial advantage and that copies bear this notice and the full citation on the irst page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior speciic permission and/or a fee. Request permissions from permissions@acm.org. © 2022 Association for Computing Machinery. 1559-1131/2022/10-ART $15.00 https://doi.org/10.1145/3532857 ACM Trans. Web 2 • Kitzler et al. EOA 1inch KYL KYL Fig. 1. A DeFi composition whereUSDT tokens are swapped againstKYL tokens through the DeFi service 1inch in a single transaction.1inch executes the swap sequentially through the DeFi services SushiSwap and UniSwap, using WETH as an intermediary token. In the transaction trace graph, we can see the user calling1inch the smart contract, which in turn triggers several calls to DeFi protocol-, and token smart contracts. As an example of a DeFi composition, consider Figure 1, which illustrates a user interacting 1inchwith the decentralized exchange (DEX) aggregator Web service . The user holds an amount ofUSDT tokens and wants to swap them to KYL tokens. Using the Web application andeher xternally owned account (EOA), she creates a transaction against the 1inchcontract, which in turn triggers a sequence of two swaps on two DeFi protocols within the same transaction, frUSD omT to WETH on SushiSwapand thereafter fromWETH to KYL on UniSwap. In this paper, we study such single transaction DeFi interactions and the networks that arise when combining multiple DeFi transactions. 1.1 Motivation In 2021, the total value of tokens held by smart contracts underlying the DeFi protocols has reached 106 billion USD [12], demonstrating rapid growth. 
As composability of DeFi protocols is frequently seen as one of the main advantages (cf. [36]), there are multiple reasons why it is interesting to study DeFi compositions:

Ecosystem interoperability. While composability can be seen as an opportunity, single transaction compositions as shown in Figure 1 currently only work within a single distributed ledger. Most of the emerging DLT scaling solutions, such as sidechains [27, 38], rollups [40], and off-chain networks [15, 37], lead to multiple, somewhat isolated DeFi ecosystems. Hence, composability is disrupted, as smart contracts on one platform cannot invoke contract functions on another platform within a single transaction. Understanding which types of compositions are frequently used may help in developing solutions to cross-chain composability [4, 20, 46, 52]. Until solutions are found, such knowledge can help in deciding which services should be co-located, and which services could be separate.

(Footnote: 1inch Web service, https://app.1inch.io; this and all the following links were accessed on June 15, 2022.)

Integration with Web technologies. Cryptoassets have started integrating with various Web technologies. For example, the Brave browser includes an integrated cryptoasset wallet and native use of BAT tokens, and various applications from the commercial BitTorrent ecosystem rely on the BitTorrent Token (BTT). This raises the question regarding the interdependence between DeFi compositions and Web technologies. Services like Furucombo already illustrate that almost arbitrary DeFi compositions can be constructed through Web interfaces. In order to develop an understanding of this, however, it is important to identify compositions and their points of interaction in the first place.

Risks through complexity. After its deregulation in the early 2000s, the securitization market became more complex and opaque.
Financial institutions used new financial instruments to maximize their exposure in this market. These instruments were based on technical computer models and traded by highly leveraged institutions, many of whom did not understand the underlying models. The instruments were highly profitable, but the lack of any infrastructure and public information about them created a massive panic in the financial system that began in August 2007 [2].

DeFi protocols may offer opportunities, such as technological innovation or new governance models. However, their composability adds additional complexity and opaqueness to an already complex cryptoasset ecosystem, which currently has a market valuation of about 1T USD. If these protocols are adopted more broadly without being understood, they could have unforeseeable systemic effects on financial markets and our society as a whole, as seen in the 2008 financial crisis [21]. A recent example involving DeFi protocols is the collapse of the stablecoin protocol Terra and its associated cryptoassets LUNA and UST. While the protocol did work as designed, its stabilization mechanism was not robust to significant selling pressure in the event of market participants panicking. This ultimately led to deleveraging spiral effects [22], destroying over 30B USD of value within a single week and rendering institutions with large exposures to LUNA or UST insolvent. In addition, the stablecoin UST was used as part of compositions in many other DeFi protocols on the Terra blockchain and through bridges on different blockchains, thus affecting the entire ecosystem [35].

Previous work (cf. [10, 16]) has partially studied risks in the DeFi ecosystem, showing possible strategies that allow rational agents to maximize their revenues by subverting the intended design of DeFi protocols, for example in DEXs and lending protocols.
However, none of the existing studies have systematically investigated compositions of DeFi protocols, which form complex, interconnected financial instruments.

(Footnotes: Brave browser, https://brave.com; BitTorrent, https://www.bittorrent.com; Furucombo, https://furucombo.app; cryptoasset market valuation, https://coinmarketcap.com/charts/.)

1.2 Contributions

Our work aims to analyze DeFi protocols and to develop a novel algorithmic method that helps to understand protocol compositions. We can summarize our contributions as follows:

(1) We provide a manually curated ground truth of 1407 addresses from 23 DeFi protocols and 10,663,881 derived associated Ethereum smart contracts. These are labels that can be reused in future research. On this basis, we propose two network abstractions, representing interactions among DeFi protocols and smart contracts (Section 3).

(2) We study intertwined DeFi protocols from a macroscopic perspective by analyzing the topology of both networks. We find that DEX and lending protocols have high degree and centrality values, and protocol interactions primarily occur in a strongly connected component. We also find that known community detection algorithms can only indicate DeFi compositions but cannot effectively disentangle them (Section 4).

(3) We address the microscopic transaction level and propose an algorithm for extracting the building blocks of DeFi protocols. We apply the algorithm to all protocol transactions in our ground truth, identify the most frequent building blocks, and find that swaps are the most frequent ones. We show what the observed space of compositions looks like for the Aave protocol. Further, we also demonstrate, using 1inch and Instadapp as examples, how to disentangle and visualize the building blocks of a single protocol as a treemap (Section 5.1).

(4) We present an overall picture of DeFi compositions by extracting and flattening the entire nested building block structure across multiple DeFi protocols.
The results show that DeFi aggregation protocols (1inch, 0x, or Instadapp) are, as expected, heavily intertwined with many other DeFi protocols, which confirms that our algorithm works as intended (Section 5.2).

(5) Finally, we present a case study illustrating how a hypothetical run on the stablecoin USD Tether would affect the building blocks of individual DeFi protocols (Section 5.3). We detect a comparatively high dependency of Curvefinance building blocks on the USDT cryptoasset.

We believe that our results are an essential contribution towards understanding DeFi compositions. On a microscopic level, our proposed methods can be used to assess the composition of individual protocols. On a macroscopic level, they show how DeFi protocols and their implementations are connected with each other. For this paper, we limit our scope to the largest Ethereum Virtual Machine (EVM)-based blockchain, Ethereum, but in principle the approach can be applied to any other EVM-based platform.

For reproducibility of results, we make our ground truth dataset, including the labels as well as our source code, openly available at https://github.com/StefanKit/Untangling_DeFi_Composition.

2 BACKGROUND AND DEFINITIONS

We now establish preliminary terms and definitions that are used throughout this work and introduce the related work.

2.1 Ethereum Account Types

Ethereum is currently the most important distributed ledger technology (blockchain) for DeFi services [53]. It differs from the Bitcoin blockchain conceptually as it implements the so-called "account model" with two different account types. An externally owned account (EOA) is a "regular" account controlled by a private key held by some user. A code account (CA), which is synonymous with the notion "smart contract", is an account controlled by a computer program, which is invoked by issuing a transaction with the code account as the recipient.
A CA must always be initially called by an external transaction originating from an EOA, but a CA can itself trigger other CAs. In the latter case, the interaction, which is also known as a "message," is denoted as an internal transaction. Several branches of internal transactions with varying depth can follow an external transaction, resulting in cascades, which altogether are called traces.

CAs allow users to implement application-layer protocols, which are essentially programs that can follow some standardized interface. Tokens are popular CA-based applications and a way to define arbitrary assets that can be transferred between accounts. The program behind a token manages token ownership and can implement a standardized interface like ERC20, which defines functions standardizing token transfer semantics.

2.2 Decentralized Finance (DeFi) Protocol

A DeFi protocol is an application-layer program that provides financial service functions such as swapping or lending assets. More technically, we can define it as follows:

Definition 2.1. A DeFi protocol is a decentralized application that facilitates specific financial service functions defined and implemented by a set of protocol-specific code accounts.

The following properties distinguish DeFi services from traditional financial services: first, they are non-custodial, meaning that no intermediary such as a bank or a broker holds custody of users' funds. Second, they are permissionless, meaning that anyone can use existing or implement new services. Third, they are transparent, which means that anyone with the necessary technical capabilities and skills can investigate and audit the state of protocols. Fourth, DeFi protocols are composable.
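To make the notions of external transaction, internal transaction, and trace from Section 2.1 concrete, the following minimal Python sketch models a trace as a tree of calls. The class and method names (`Call`, `depth`, `flatten`) and the short account labels are illustrative assumptions and are not taken from the paper's code base; the example call sequence only loosely mirrors Figure 1.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Call:
    """One transaction in a trace: a call from `source` to `target`."""
    source: str
    target: str
    children: List["Call"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of levels in the cascade rooted at this call."""
        return 1 + max((c.depth() for c in self.children), default=0)

    def flatten(self) -> List[Tuple[str, str]]:
        """(source, target) pairs in depth-first order."""
        pairs = [(self.source, self.target)]
        for child in self.children:
            pairs.extend(child.flatten())
        return pairs


# Hypothetical trace: an EOA calls an aggregator CA (external transaction),
# which then triggers further CAs (internal transactions).
trace = Call("EOA", "1inch", [
    Call("1inch", "SushiSwap", [Call("SushiSwap", "WETH")]),
    Call("1inch", "UniSwap", [Call("UniSwap", "KYL")]),
])

external = trace.flatten()[0]    # the root is the external transaction
internal = trace.flatten()[1:]   # everything below it is internal
```

Representing the cascade as a tree keeps the nesting of calls explicit, which is the structure that later sections decompose into building blocks.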
2.3 DeFi Protocol Compositions

The last property, composability, is the most crucial for this work and requires a more detailed description: CAs can call each other, and their individual functions can be arbitrarily composed into new financial products and services ("Financial Lego") [49]. While this analogy is widely used in the literature, to the best of our knowledge, no work investigates which are the basic composable building blocks of more complex financial services and how they are related. Harvey et al. [19] refer broadly to composability as asset tokenization and networked liquidity, while Von Wachter et al. [44] conceive composability narrowly as a repeated wrapping operation of tokens resulting in new derivative products. However, as illustrated before in Figure 1, we note that DeFi compositions also involve CAs, which are not tokens. Also, Engel and Herlihy [13] and Tolmach et al. [41] respectively discuss compositions only in the context of automated market makers (AMMs) and of formal verification of CAs related to decentralized exchanges and lending services, which is again a very narrow conception. Thus, there is no comprehensive, technically grounded definition for DeFi compositions to the best of our knowledge. For our work, we define it as follows:

Definition 2.2. A DeFi protocol composition occurs when a protocol-specific account leverages, within a single transaction, one or more accounts belonging to the same or another DeFi protocol to provide a novel financial service.

2.4 Related Work

Others have previously studied networks closely related to the ones we investigate: Guo et al. [18] are amongst the first to investigate the Ethereum transaction graph, finding that volumes moved and the numbers of transactions follow a power law distribution, that the component structure follows a bow-tie model, and that negative assortativity is plausibly explained by the presence of service providers such as exchanges. Chen et al. [7] conduct a systematic study of Ethereum between 2015 and 2018 and exploit graph analysis measures to describe three different network constructions (money transfer, smart contract creation, and smart contract invocation). Another systematic study has been conducted by Lee et al. [24], who analyzed the local and global properties of interaction networks extracted from the entire Ethereum blockchain statically, finding heavy-tailed degree distributions. In a follow-up, Zhao et al. [54] analyzed the temporal evolution of Ethereum interaction networks and found that they proliferate and follow the preferential attachment growth model.

Furthermore, several studies focus on the network of Ethereum's tokenized assets: Somin et al. [39], for instance, studied the combined graph of all fungible token networks, while Victor and Lüders [43] explored the networks of the top 1,000 ERC20 tokens individually. Fröwis et al. [14] proposed a method for detecting token systems independent of an implementation standard. Also, Chen et al. [8] conducted a systematic investigation of the whole Ethereum ERC20 token ecosystem and analyzed their activeness, purpose, relationship, and role in token trading.

Other studies exploited network methods for the detection of specific nodes using graph-based approaches. Poursafaei et al. [32] developed a method based on graph node feature extraction and graph representation learning techniques to identify illicit nodes. Li et al. [25] and Ofori-Boateng et al. [30], instead, respectively use Topological Data Analysis (TDA) to detect price anomalies and hidden co-movement in pairs of tokens, and for anomalous events detection in a multilayer network. However, none of these related works considers networks that represent DeFi protocols and their relationships.

Another growing body of research concentrates on specific functions offered by individual DeFi protocols or types of protocols.
We are aware of many DEX-related measurements focusing on protocol-specific aspects, such as the magnitude of cyclic arbitrage activity [47], the behavior of liquidity providers [48], or the role of oracles as providers of external information [26]. Other studies focus on lending and borrowing services: Perez et al. [31] analyze liquidations and related participants' behavior in the DeFi protocol Compound, while Gudgeon et al. [17] compare market efficiency, utilization, and borrowing rates in different lending protocols. Also, Wang et al. [45] provide methods to identify flash loans in three different DeFi providers and measure their related activity. Finally, we are aware that von Wachter et al. [44] investigate composability from an asset perspective and measure composability by identifying the number of derivatives produced from an initial root asset. However, we apply a more technical, service-oriented perspective and consider, to put it simply, a DeFi composition as being a computer program utilizing other programs' functions.

Overall, we are not aware of previous studies providing a comprehensive picture of DeFi compositions across various protocols. We also do not know of any work that analyzes in detail the building blocks of individual DeFi protocols. With this work, we want to close this gap.

3 DATASET AND NETWORK CONSTRUCTION

This section describes the data we collected and the network abstractions we constructed for subsequent analysis steps.

3.1 Dataset collection

To study DeFi compositions, we are interested in transactions between Ethereum code accounts associated with known DeFi protocols. Thus, we used on-chain transaction data from the Ethereum blockchain and built a ground truth of known CAs and their associations to DeFi protocols.

3.1.1 On-chain transaction data.
While Ethereum's history goes as far back as July 2015, DeFi only emerged as a popular term around summer 2020, when these protocols first saw increased usage. This informed our choice of the analysis time frame and allowed us to refer to external sources providing information on popular, established DeFi services. We used an OpenEthereum client and ethereum-etl to gather all Ethereum transactions from 01-Jan-2021 (block 11,565,019) to 05-Aug-2021 (block 12,964,999). We collected each external transaction and also parsed its cascade of internal transactions, which together give us the trace. For each transaction, we extracted the source and destination account addresses, the transaction hash, the transferred value, the transaction type (call, create, or self-destroy), as well as the trace ID, which indexes the transactions by their execution order. Additionally, we collected the method ID, i.e., the first 4 bytes of the input sequence, which allows us to identify the signature of called methods using the 4Byte lookup service.

To distinguish between CAs and EOAs, we gathered all code account creation transactions from the first CA created on Ethereum until the end of our observation period. We also use these creation traces to associate each CA with its creator CA. In total, we found 46,112,390 CAs and used the output byte sequence to identify 143,324 contracts conforming to the ERC20 standard.

3.1.2 Ground truth data.

To be able to analyze DeFi protocols, we need a ground truth dataset on which smart contracts are part of a given protocol. We focus on the most relevant protocols regarding valuation and gas burned between 06-Mar-2021 and 05-Aug-2021, using monthly samples of the top three total-value-locked (TVL) protocols from DeFi Pulse for each financial service category. Additionally, we consider protocols including CAs of the top ten gas burner list in the observation period. The result defines the set of DeFi protocols we want to investigate.
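The method ID collected in Section 3.1.1 is the first four bytes of a call's input data. As a minimal sketch, the snippet below extracts it from a hex-encoded input and resolves it against a small local table. The function name `method_id`, the table `KNOWN_SIGNATURES`, and the sample input are illustrative assumptions; the three selectors listed are those of the standard ERC20 functions, and a full lookup would query a service such as 4Byte rather than a hard-coded dictionary.

```python
from typing import Optional

# Small local stand-in for a signature lookup service (assumption):
# 4-byte selectors of three standard ERC20 functions.
KNOWN_SIGNATURES = {
    "0xa9059cbb": "transfer(address,uint256)",
    "0x095ea7b3": "approve(address,uint256)",
    "0x23b872dd": "transferFrom(address,address,uint256)",
}


def method_id(input_hex: str) -> Optional[str]:
    """Return the 4-byte method ID of a call's input data, or None if
    the input is shorter than four bytes (e.g. a plain value transfer)."""
    data = input_hex[2:] if input_hex.startswith("0x") else input_hex
    if len(data) < 8:  # 4 bytes = 8 hex characters
        return None
    return "0x" + data[:8].lower()


# Hypothetical input: selector followed by two 32-byte arguments
# (a zero-padded recipient address and an amount).
sample_input = "0xa9059cbb" + "00" * 12 + "11" * 20 + "00" * 31 + "01"
selector = method_id(sample_input)
signature = KNOWN_SIGNATURES.get(selector, "unknown")
```

Returning `None` for short inputs keeps plain value transfers, which carry no method ID, distinguishable from contract calls.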
Table 1 reports summary statistics for the 23 protocols in our sample, divided by category. The last column reports, for each protocol, the average share of its TVL with respect to the entire DeFi ecosystem, between March and August 2021. In total, our 23 DeFi protocols cover more than 81% of the entire DeFi TVL. According to DeFi Pulse, in August 2021 more than a hundred DeFi protocols existed, but only around 30 (of which 18 are in our sample) had a TVL larger than 200M USD. Most of the protocols in our sample are still the most relevant ones for TVL as of June 2022.

(Footnotes: ethereum-etl, https://github.com/blockchain-etl/ethereum-etl; 4Byte lookup service, https://www.4byte.directory/; DeFi Pulse, https://defipulse.com/; gas burner list, https://ethgasstation.info/gas-burners.)

In the following, we briefly introduce the categories and protocols as reported by DeFi Pulse:

• Assets: this category includes cryptoasset management protocols, such as yield aggregators, that aim at maximizing the value of a portfolio or basket of underlying assets. Harvestfinance, Yearn, and Vesper share a similar mechanism, whereby they pool resources which are in turn invested in other DeFi platforms according to different optimization strategies. Users are typically rewarded through tokenized assets. Convex enables Curvefinance liquidity providers to earn additional rewards. Badger allows Bitcoin users to deposit tokenized Bitcoin such as wBTC and consequently generate a yield by following programmatic optimization strategies. Similarly, RenVM bridges digital assets across DeFi ecosystems by minting ERC20 tokens on Ethereum with a 1:1 ratio. Fei's protocol builds on a decentralized stablecoin backed by cryptoassets exploited through yield strategies established by the protocol's governance.

• Derivatives protocols allow issuing synthetic financial instruments in the DeFi ecosystem, either tracking other cryptoassets or real-world off-chain assets.
Synthetix, for instance, supports several real-world assets, such as fiat currencies and metals, while dYdX allows investors to trade perpetual positions on the underlying cryptoassets. Hegic enables the issuing of ETH and wBTC call and put options. Futureswap users can open leveraged long and short positions on cryptoassets. Nexus, instead, provides financial insurance instruments that cover potential losses users might incur; similarly, Barnbridge offers tools to hedge risk through its financial instruments.

• DEXs, i.e., decentralized exchanges, allow users to exchange cryptoassets. UniSwap, SushiSwap, Curvefinance, and Balancer all exploit Automated Market Makers (AMM), as well as bonding curves and constant functions, to algorithmically set the cryptoasset prices, while 0x is based on the order book mechanism. The 1inch protocol aggregates information on liquidity from several DEXs and routes transactions to those offering the best prices.

• Lending protocols provide investors with automated markets for loanable funds: lenders issue interest-bearing instruments and borrowers can take positions, typically conditional on the provision of collateral that covers potential losses. Aave and Compound follow the model described above. Maker users lock their cryptoassets as collateral and receive the DAI token in return. Instadapp follows a more complex scheme and acts mostly as an aggregator of multiple DeFi protocols.

After identifying the most relevant DeFi protocols, we manually collected the CAs associated with each protocol. Since this information is not available on the blockchain, we rely on off-chain and publicly available sources like protocol websites and available documentation. We resolved conflicts of duplicate CA-to-protocol assignments and identical names by querying CA addresses on Etherscan, uniquely assigned each CA address to its original protocol, and obtained a unique label.
We denote these manually collected data points as seed data and make them available as part of our source code repository.

(Footnotes: 10 out of the first 11 DeFi protocols for TVL in DeFi Pulse are in our dataset. DeFi Pulse reports the protocols divided into five categories; we do not include the Payment category because services like Polygon provide off-chain functionality rather than composable financial services or products. Etherscan, https://etherscan.io/.)

Table 1. Ground truth dataset summary statistics. Seed addresses were manually collected for each DeFi protocol. The extended seed addresses are heuristically derived and also include the code accounts created from the seed addresses. The Seed and Extended seed columns report numbers of addresses.

Protocol type | DeFi Protocol | Seed | Extended seed | External calls | % TVL
Assets | Badger | 64 | 278 | 258,773 | 1.09%
Assets | Convex | 22 | 131 | 147,855 | 1.13%
Assets | Fei | 40 | 37 | 146,691 | 0.28%
Assets | Harvestfinance | 101 | 803 | 119,631 | 0.46%
Assets | RenVM | 15 | 15 | 234,161 | 0.86%
Assets | Vesper | 44 | 44 | 94,189 | 1.19%
Assets | Yearn | 3 | 3 | 243,036 | 3.54%
Derivatives | Barnbridge | 40 | 46 | 55,588 | 0.17%
Derivatives | dYdX | 38 | 38 | 107,264 | 0.14%
Derivatives | Futureswap | 9 | 10 | 6484 | 0.04%
Derivatives | Hegic | 8 | 8 | 8372 | 0.03%
Derivatives | Nexus | 24 | 26 | 20,067 | 0.57%
Derivatives | Synthetix | 271 | 272 | 611,942 | 2.55%
DEX | 0x | 28 | 50 | 2,094,335 | -
DEX | 1inch | 15 | 10,338,305 | 1,277,641 | 0.52%
DEX | Balancer | 9 | 3473 | 281,530 | 2.29%
DEX | Curvefinance | 163 | 267 | 745,672 | 9.28%
DEX | SushiSwap | 12 | 1705 | 2,026,674 | 5.37%
DEX | UniSwap | 15 | 54,038 | 28,394,798 | 8.30%
Lending | Aave | 157 | 166 | 851,578 | 13.31%
Lending | Compound | 67 | 65 | 741,069 | 11.48%
Lending | Instadapp | 72 | 32,770 | 97,080 | 7.39%
Lending | Maker | 190 | 231,261 | 2,992,692 | 11.77%

Next, we extended our seed data by implementing a heuristic that uses the creation transactions and identifies the CAs deployed by each seed address. By default, all extended addresses inherit the label and protocol assignments from the corresponding seed address. If the procedure leads to a conflict of labels for an address, we preserve the one obtained through the heuristic. Combined with our seed data, these extended addresses form our extended seed data set.
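The seed-extension heuristic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `extend_seed`, the input shapes, and the example addresses are hypothetical, and the sketch propagates labels one hop only, since the text does not state whether the procedure is applied transitively.

```python
from typing import Dict, Iterable, Tuple


def extend_seed(seed: Dict[str, str],
                creations: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Propagate protocol labels from seed addresses to the CAs they created.

    seed      -- maps a seed address to its protocol
    creations -- (creator, created) pairs taken from creation traces
    """
    derived: Dict[str, str] = {}
    for creator, created in creations:
        if creator in seed:
            # Assumption: if a CA appears with several labeled creators,
            # the first one seen wins.
            derived.setdefault(created, seed[creator])
    # Start from the seed labels, then let heuristic labels override them,
    # mirroring the conflict rule stated in the text.
    extended = dict(seed)
    extended.update(derived)
    return extended


# Hypothetical example data.
seed = {"0xfactory": "UniSwap", "0xvault": "Yearn", "0xpool": "Yearn"}
creations = [
    ("0xfactory", "0xpair1"),
    ("0xfactory", "0xpair2"),
    ("0xunlabeled", "0xother"),  # creator not in the seed: ignored
    ("0xfactory", "0xpool"),     # conflicts with the seed label of 0xpool
]
extended = extend_seed(seed, creations)
```

In this example `0xpool` ends up labeled with the protocol of its creator, which illustrates how a protocol's address count can shrink between the Seed and Extended seed columns.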
Table 1 summarizes the number of seed and extended addresses collected for each DeFi protocol. It shows that our automated expansion does not substantially increase the number of addresses for most assets and derivatives protocols. However, it massively expands the dataset for DEXs and lending protocols utilizing automated factory contract deployments. In particular, more than 10 million additional CAs are associated with 1inch due to the factory contract that deploys gas tokens. The External calls column shows the number of external transactions directed to each of our DeFi protocols. The distribution is heterogeneous, and again the most relevant categories are DEX and lending. UniSwap is the most frequently appearing one, with a gap of around one order of magnitude to the second one, which is Maker.

3.1.3 Dataset reduction.

As we are only interested in known DeFi protocols, we finally limited and reduced the traces data set to the subset of protocol traces, where the initial external transaction originating from an EOA triggers a CA address in our extended seed dataset. This reduction allows us to investigate and interpret compositions within the context of known protocols.

[Figure 2 panels: DeFi Protocol Network; DeFi CA Network.]

Fig. 2. Schematic illustration of constructed networks. The lower-level DeFi Code Account (CA) network represents interactions between CAs. The higher-level DeFi Protocol Network models relations between DeFi protocols. Lower-level CA vertices are associated with higher-level protocol vertices. CAs are triggered by EOAs or other CAs.

3.2 Network construction

In our analysis, we want to understand and discover relations between DeFi protocols and associated CAs.
For that purpose, as shown in Figure 2, we constructed networks consisting of DeFi traces on two abstraction levels: the lower-level DeFi Code Account (CA) Network and the higher-level DeFi Protocol Network.

The DeFi CA Network includes all known ground truth CAs triggered by external transactions from arbitrary EOA addresses and all CAs subsequently called by cascades of internal transactions. We note that CAs in the network may or may not be associated with a DeFi protocol in our ground truth dataset. We construct the network by filtering all internal and external transactions between CAs from the protocol traces. Since repeated usage of DeFi services results in recurring transaction patterns, we aggregate and count transactions with the same source and destination address.

The DeFi Protocol Network represents interactions between protocols. We constructed it by merging all DeFi CA vertices associated with the same DeFi protocol into a single node. We note that we modeled both networks as directed graphs, in which vertices represent either a protocol or a single CA. The weighted edges represent the aggregated set of transactions between DeFi protocols or CAs.

4 TOPOLOGY MEASUREMENTS

We now analyze the constructed networks from a macroscopic perspective. Since our research focuses on understanding DeFi compositions, we do not aim at conducting an encompassing study of the entire Ethereum topology, as was done in previous studies (see Section 2.4). This supports our choice to focus on a narrower number of targeted metrics that provide relevant insights on composability aspects; other approaches that are beyond the scope of our work are discussed in Section 6.2. The analysis of the degree distribution and centrality measures can help identify the CAs implementing core functionalities, and the reciprocity and assortativity provide additional insights on the relationships across such CAs. To understand how CAs associated with the same protocols interact with each other, we investigate how the network is separated into different components and whether known community detection algorithms identify community structures that overlap with the protocol structures.

We start by reporting basic summary statistics for the DeFi CA network and the DeFi Protocol network in Table 2. The main difference is in the network dimension, the latter being two orders of magnitude smaller. The presence of self-loops indicates that some contracts include multiple functionalities and thus can also call themselves. Both networks are sparse, as shown by the average degree and density measure, suggesting that CAs tend to interact with only a few other CAs.

Table 2. Summary statistics of the analyzed networks.

Metric | DeFi CA network | DeFi Protocol network
Nodes | 2,536,371 | 43,624
Edges | 3,472,757 | 84,789
Self-loops | 6668 | 146
Average degree | 1.369 | 1.944
Density | 5.398e-07 | 4.456e-05

[Figure 3: CCDF (vertical axis) versus Degree (horizontal axis), both on logarithmic scales.]

Fig. 3. Degree distributions of the CA and Protocol networks, shown in the plot as complementary cumulative distribution functions (CCDF). The estimated parameters $\hat{\theta} = (\hat{k}_{min}, \hat{\alpha})$ are respectively $\hat{\theta}_{CA} = (93, 1.69)$ and $\hat{\theta}_{P} = (25, 1.83)$. In both networks, high-degree nodes are associated with DEX or lending protocols. For the CA network, they are routing contracts or factory contracts that deploy other contracts. Nodes with high degree are likely to contain core functionalities and thus to play a relevant role in compositions.

Table 3. Likelihood ratio and p-value. None of the reported heavy-tailed distributions is favored over the power law.
Alternative distribution | DeFi CA Network | DeFi Protocol Network
Exponential | R: 1.322, p-val: 0.186 | R: 4.753, p-val: 0.000
Lognormal | R: -0.406, p-val: 0.685 | R: 0.191, p-val: 0.848
Weibull | R: 1.122, p-val: 0.262 | R: 2.742, p-val: 0.006

4.1 Degree distribution

Looking at the total-value-locked at DeFi Pulse, we can observe that some DeFi protocols and their contracts play a major role. This observation suggests that they might implement core functionality, which other protocols in DeFi compositions can in turn utilize. Under this assumption, preferential attachment [3, 33] is a plausible generative mechanism for both networks. More generally, networks whose degree distribution follows a power law, i.e., the fraction of vertices with degree $k$ is given by $p(k) \sim k^{-\alpha}$ for values of $k \geq k_{min}$, are often associated with such a generative mechanism. We thus estimate the parameters $\hat{\theta} = (\hat{k}_{min}, \hat{\alpha})$ for our two networks and investigate if the power law distribution is a good fit.

We rely on the methodology introduced by Clauset et al. [9] and by Broido et al. [6]: evidence of scale-free properties exists either when no alternative heavy-tailed distribution is relatively better than the power law or when the power law is a plausible model for the distribution. In the former case, the network exhibits Super-Weak scale-free structure. In the latter, evidence of scale-free properties is said to be Weak if the tail of the distribution contains at least 50 nodes, and Strong if also $2 < \hat{\alpha} < 3$ holds.

We start by estimating the parameters $\hat{\theta} = (\hat{k}_{min}, \hat{\alpha})$ by minimizing the Kolmogorov-Smirnov distance between empirical and fitted data for $k_{min}$, and exploit it to estimate $\hat{\alpha}$ through the method of maximum likelihood estimation [9]. We then conduct a goodness-of-fit test via a bootstrapping procedure (n = 5,000). The resulting p-value indicates whether the power law is a plausible fit (p ≥ 0.1) for the empirical data or not.
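The two-step estimation described above (choose the lower cut-off by minimizing the Kolmogorov-Smirnov distance, then estimate the exponent by maximum likelihood) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses the continuous power law approximation rather than a discrete one, omits the bootstrap goodness-of-fit test, and the function names and the `min_tail` parameter are assumptions introduced here.

```python
import math
from typing import Sequence, Tuple


def fit_alpha(tail: Sequence[float], k_min: float) -> float:
    """Continuous maximum likelihood estimate of the exponent."""
    return 1.0 + len(tail) / sum(math.log(k / k_min) for k in tail)


def ks_distance(tail: Sequence[float], k_min: float, alpha: float) -> float:
    """Largest gap between the empirical and the fitted CDF on the tail."""
    xs = sorted(tail)
    n = len(xs)
    gap = 0.0
    for i, k in enumerate(xs):
        model = 1.0 - (k / k_min) ** (1.0 - alpha)
        # Compare with the empirical CDF just before and at this point.
        gap = max(gap, abs(model - i / n), abs(model - (i + 1) / n))
    return gap


def fit_power_law(degrees: Sequence[float],
                  min_tail: int = 10) -> Tuple[float, float, float]:
    """Scan candidate cut-offs; keep the one with the smallest KS distance.

    Returns (k_min, alpha, ks_distance).
    """
    best = None
    for k_min in sorted(set(degrees)):
        tail = [k for k in degrees if k >= k_min]
        if len(tail) < min_tail or all(k == k_min for k in tail):
            continue  # too few points, or no spread to estimate from
        alpha = fit_alpha(tail, k_min)
        dist = ks_distance(tail, k_min, alpha)
        if best is None or dist < best[2]:
            best = (k_min, alpha, dist)
    if best is None:
        raise ValueError("not enough data to fit a tail")
    return best
```

The scan over every distinct degree is quadratic in the worst case, so for networks of the size reported in Table 2 one would restrict the candidate cut-offs or rely on an established fitting library.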
Finally, we conduct a log-likelihood ratio (R) test to compare the power law fit against other heavy-tailed distributions (i.e., the Exponential, the Lognormal, and the Weibull). A positive value indicates that the power law distribution is favored over the alternative, and the statistical significance is supported by a p-value that indicates if the hypothesis R = 0 is rejected (p < 0.1) or not (p ≥ 0.1).

Figure 3 shows the power law fit for both networks and their estimated $\hat{k}_{min}$ and $\hat{\alpha}$. Coherently with other studies on the interaction networks from Ethereum blockchain data [24], $\hat{\alpha}$ lies between 1.7 and 1.8, thus being slightly smaller than the average values usually found for power law distributions. The hypothesis that a power law distribution is a good fit is not plausible for either network because p-values are 0.020 and 0.035 for the CA and Protocol networks, respectively. Table 3 reports the comparisons with other heavy-tailed distributions and shows that the power law is not significantly favored over the Lognormal distribution for both networks, while it is a better fit than the Weibull and the Exponential for the Protocol network. In summary, according to the classification proposed in Broido et al. [6], both networks have Super-Weak scale-free properties.

Table 4 inspects the tails of the distributions and reports the top 15 CAs sorted by highest degree: most of the CAs are associated with a few DEX and lending protocols (1inch, UniSwap, 0x, Instadapp, Maker). We can hypothesize that they are part of DeFi compositions, which we will explore further in subsequent sections.

4.2 Centrality measures

The results in the previous section highlight the relevant role of DEXs and lending protocols. Network centrality measures are another helpful tool to determine which nodes might implement core functionalities. We consider the In degree centrality, as we are interested in identifying relevant contracts that other protocols may use in DeFi compositions.
To add further insights, we also provide the results for the Katz and PageRank algorithms. Katz centrality accounts for the importance of a node's neighbors. It is an extension of the eigenvector centrality that addresses issues arising with directed networks [28] by adding a constant initial weight to each node. PageRank takes into account the Out degree of nodes to control for the drawback of the Katz algorithm that peripheral nodes might get too high values if linked to a very central node.

Table 4. First 15 CAs by highest degree.

| Address | Label | Protocol | Degree | In degree | Out degree |
|---|---|---|---|---|---|
| 0x00000000000049... | CHI Token | 1inch | 2,713,153 | 305,627 | 2,407,526 |
| 0x7a250d5630b4cf... | UniswapV2Router02 | UniSwap | 56,007 | 1,711 | 54,296 |
| 0xc02aaa39b223fe... | EtherToken-v4 | 0x | 54,469 | 45,129 | 9,340 |
| 0x5c69bee701ef81... | UniswapV2Factory | UniSwap | 46,408 | 26,576 | 19,832 |
| 0x2971adfa57b20e... | Mainnet-InstaIndex | Instadapp | 34,497 | 18,369 | 16,128 |
| 0x4c8a1beb8a8776... | Mainnet-InstaList | Instadapp | 33,551 | 16,956 | 16,595 |
| 0x5ef30b99863452... | CDP_MANAGER | Maker | 15,300 | 8,940 | 6,360 |
| 0x35d1b3f3d7966a... | MCD_VAT | Maker | 15,214 | 15,214 | 0 |
| 0xa26e15c895efc0... | PROXY_FACTORY | Maker | 13,718 | 1 | 13,717 |
| 0x0000000000b3f8... | GST2 Token | Unknown | 13,447 | 7,644 | 5,803 |
| 0x11111112542d85... | contractAddress | 1inch | 12,371 | 2,073 | 10,298 |
| 0x6b175474e89094... | MCD_DAI | Maker | 12,314 | 12,314 | 0 |
| 0xdef1c0ded9bec7... | ExchangeProxy-v4 | 0x | 11,147 | 1,138 | 10,009 |
| 0x939daad09fc4a9... | mainnet-v1-InstaAccount | Instadapp | 10,876 | 10,876 | 0 |
| 0xfd3dfb524b2da4... | N/A | Unknown | 10,554 | 1,547 | 9,007 |

The values of each centrality metric are normalized to the range [0,1]. We find that both networks are dominated by a few nodes with relatively high values (for all centrality measures) with respect to the other nodes; the In degree values are almost always higher than the Katz ones, which in turn are often slightly larger than the PageRank centrality values.
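To make the three measures concrete, the following is a minimal pure-Python sketch on an edge list. It is an illustration only, not the tooling used for the study: the parameter values (`alpha`, `beta`, `damping`, the number of iterations) and the choice to normalize by dividing by the maximum are assumptions.

```python
def in_degree(nodes, edges):
    """Number of incoming edges per node."""
    score = {v: 0.0 for v in nodes}
    for _, v in edges:
        score[v] += 1.0
    return score


def katz(nodes, edges, alpha=0.1, beta=1.0, iterations=100):
    """Iterate x_v = beta + alpha * (sum of x_u over incoming edges u -> v)."""
    score = {v: beta for v in nodes}
    for _ in range(iterations):
        nxt = {v: beta for v in nodes}
        for u, v in edges:
            nxt[v] += alpha * score[u]
        score = nxt
    return score


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power iteration; each node splits its score over its outgoing edges."""
    out_deg = {v: 0 for v in nodes}
    for u, _ in edges:
        out_deg[u] += 1
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Nodes without outgoing edges spread their score uniformly.
        dangling = sum(score[v] for v in nodes if out_deg[v] == 0)
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u, v in edges:
            nxt[v] += damping * score[u] / out_deg[u]
        score = nxt
    return score


def normalize(score):
    """Scale scores so that the most central node has value 1."""
    top = max(score.values())
    return {v: (s / top if top else 0.0) for v, s in score.items()}
```

On a toy graph in which three contracts call a single hub, the hub receives the value 1 under all three normalized measures, mirroring the pattern of a few dominant nodes described above.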
Table 5 reports the values for the nodes with the highest centrality in the Protocol (left) and the DeFi CA (right) networks. We show only the first three nodes because the others have relatively smaller values in comparison. In the Protocol network, the most central nodes are two non-labeled CAs. When considering the ranking of the nodes in the highest 10 positions for at least one centrality measure, 10 DeFi protocols appear in the highest positions, and Uniswap, in particular, plays an important role. Such protocols are thus heavily used by other non-labeled CAs in our dataset. Uniswap, 0x and Maker have higher centrality values with respect to the other protocols.

The DeFi CA network is dominated by the 1inch factory contract mentioned in Section 3.1.2 that deploys CHI tokens. Two other nodes with relatively high values are the wETH CA related to 0x and another factory contract associated with Uniswap. Considering again the nodes ranking in the highest 10 positions for at least one centrality measure, CAs associated to Instadapp and Maker appear repeatedly. Factory deployer contracts play a major role in the DeFi CA network. Note that, by definition, such contracts have a high Out degree, as their functional role is to deploy other contracts. Interestingly, the In degree centrality results thus show that they also have a relevant role as recipients of calls by other contracts of the network. In conclusion, these results are consistent with the findings of Section 4.1 in showing that DEX and lending protocols play a major role and may be involved in compositions.

Table 5. In degree, Katz and PageRank centrality measures of the three most central nodes. For the Protocol network (left), the column Address/Protocol reports the address of non-labeled CAs or the protocol name associated to the node.
For the DeFi CA network (right), the column Protocol_Address reports the protocol associated to the CA and the CA itself.

| Address/Protocol (Protocol network) | In degree | Katz | PageRank | Protocol_Address (DeFi CA network) | In degree | Katz | PageRank |
|---|---|---|---|---|---|---|---|
| 0x0000000000b3f8... | 1 | 1 | 1 | 1inch_0x00000... | 1 | 1 | 1 |
| 0xcc88a9d330da11... | 0.371 | 0.168 | 0.111 | 0x_0xc02aa... | 0.148 | 0.107 | 0.053 |
| UniSwap | 0.313 | 0.176 | 0.092 | UniSwap_0x5c69b... | 0.087 | 0.064 | 0.036 |

4.3 Reciprocity and assortativity

Next, we look at two measures that provide information on the relationship between nodes and their neighbors, that is, reciprocity and assortativity. Reciprocity is the likelihood that nodes are mutually linked. Values range from 0 to 1, the former meaning that the network is purely unidirectional, the latter indicating that all links are reciprocated. For both the DeFi CA and the Protocol networks, the values (respectively 0.234 and 0.215) are similar to the one reported in [24], and we follow their interpretation that the presence of reciprocated links is a potential sign of composability, as it shows that smart contracts tend to rely often on each other. The lower value obtained for the Protocol network could be explained by the presence of many non-labeled (non-protocol-specific) CAs. If we further reduce the Protocol network by removing all non-labeled CAs, obtaining a graph abstraction of 23 nodes, the reciprocity (0.677) is much higher, indicating that protocols interact with each other more often and in a bidirectional way, a sign that compositions exist. Assortativity is a metric that indicates whether nodes with similar degrees tend to interact with each other ($1 > r > 0$), or if nodes with high degrees interact more with low degree nodes ($0 > r > -1$).
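Both measures can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions rather than the computation used in the study: reciprocity is taken as the share of directed edges whose reverse edge exists, and assortativity as the Pearson correlation of total degrees across the endpoints of the undirected version of the graph; the function names are hypothetical.

```python
def reciprocity(edges):
    """Share of directed edges u -> v whose reverse edge v -> u also exists."""
    edge_set = {(u, v) for u, v in edges if u != v}
    mutual = sum(1 for u, v in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


def degree_assortativity(edges):
    """Pearson correlation of the degrees found at the two ends of each edge."""
    undirected = {frozenset((u, v)) for u, v in edges if u != v}
    degree = {}
    for pair in undirected:
        for node in pair:
            degree[node] = degree.get(node, 0) + 1
    xs, ys = [], []
    for pair in undirected:
        u, v = tuple(pair)
        # Count each edge in both directions so the measure is symmetric.
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    # Note: var is zero for regular graphs, where the coefficient is undefined.
    return cov / var
```

A star graph, in which one high-degree node is connected only to degree-one nodes, yields a coefficient of -1, the extreme case of the disassortative pattern.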
Consistently with previous results on the Ethereum transaction network, both networks are disassortative (-0.473 for the DeFi CA network and -0.262 for the Protocol network), indicating heterogeneity and a sign that CAs with high degree are leveraged by many other CAs with a less relevant role in the ecosystem. As shown above, such nodes are often associated with DEX and lending protocols.

4.4 Components

Reciprocity shows that protocols interact bidirectionally with accounts related to other protocols. We thus look at metrics providing further insights on how the (code accounts of) different protocols fall into distinct disconnected components. We distinguish between weakly connected components, in which all the nodes are connected by a path independently of the directions of the edges, and strongly connected components, which consider the edge direction. For the Protocol network, we find that the largest weakly connected component is equal to the entire network, while for the CA network, only 34 nodes are outside of the largest component. These remaining nodes fall into 16 components, with a few nodes each. Table 6 lists the three largest strongly connected components. By comparing the number of edges and nodes, we notice that the second-largest component of both the Protocol and the CA network is denser than the other larger components. Additionally, in Figure 4 we illustrate how CAs belonging to the different protocols map to the ten largest strongly connected components of the CA network. Interestingly, the second-largest component also encompasses the vast majority of protocol interactions. While the largest component is entirely composed of CAs associated with the 1inch protocol, in the second-largest component, we find addresses of all the analyzed protocols except for RenVM, which is not present in any of the reported large components. We also find that all the protocols fall into the second-largest strongly connected component regarding the Protocol network.
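For readers who want to reproduce this kind of decomposition on their own edge lists, the following is a minimal sketch of strongly connected component detection using Kosaraju's algorithm. The choice of algorithm and the function name are assumptions made for illustration; the source does not state which implementation was used.

```python
from collections import defaultdict


def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm, written iteratively to avoid recursion limits."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the original graph.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: explore the reversed graph in reverse completion order.
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = [], [start]
        while todo:
            node = todo.pop()
            component.append(node)
            for prev in rev[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(sorted(component))
    # Largest components first, as in a "three largest components" listing.
    return sorted(components, key=len, reverse=True)
```

The density comparison then amounts to dividing edge counts by node counts: with the figures reported in Table 6, the second-largest component of the CA network has about 5.4 edges per node (370,833 / 69,116), against 2.0 for the largest (611,160 / 305,581).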
This analysis shows that interactions among protocols primarily occur in a single, large component that is more interconnected than average. Notably, such interactions might indicate the existence of compositions due to the overlapping transaction structure of multiple protocols.

[Figure 4: heatmap. x-axis: protocols. y-axis: component ids with total node counts: 1 (305,581), 2 (69,116), 3 (5,622), 4 (36), 5 (34), 6 (20), 7 (16), 8 (15), 9 (11), 10 (9). Color scale: node count, logarithmic from 1e+00 to 1e+05.]

Fig. 4. Heatmap showing how the addresses associated to different protocols fall into the ten largest strongly connected components. The largest component is uniquely composed of 305,581 1inch addresses, while the second collects the vast majority of protocols. Smaller components identify addresses of protocols that do not interact outside of the protocol itself.

Table 6. Description of the three largest strongly connected components. For both networks the pattern is fragmented, but interestingly the second largest strongly connected components are remarkably more interconnected, indicating that nodes in these components interact with many other nodes, a prerequisite for composition.

| Network | # Comp. | Largest: Nodes | Largest: Edges | 2nd largest: Nodes | 2nd largest: Edges | 3rd largest: Nodes | 3rd largest: Edges |
|---|---|---|---|---|---|---|---|
| Contract | 2,155,707 | 305,581 | 611,160 | 69,116 | 370,833 | 5,622 | 11,242 |
| Protocol | 33,832 | 5,622 | 11,242 | 3,948 | 14,264 | 36 | 71 |

4.5 Community Detection

One could naively assume that CAs associated with specific DeFi protocols form communities in the Code Account network. However, the previous results suggest that the network topology reflects DeFi compositions at the level of the community structure. We thus measure how effectively different community detection algorithms detect protocols in the DeFi CA network. We follow the approach of Yang et al. [51], who provide guidelines for selecting community detection algorithms depending on the size of the network.
We analyze the largest weakly connected component in its unweighted and undirected version with non-overlapping communities using four different algorithms: multilevel or Louvain [5], label propagation [34], leading eigenvector [29], and Leiden [42]. Using the labeled addresses in our ground truth dataset, we can verify to what extent $\hat{C}$, the set of communities identified by partitioning algorithms, corresponds to $C$, the set of ground truth communities defined by the individual protocols. We quantify their performance through the normalized mutual information (NMI), a benchmark measure in the literature [11, 23] that quantifies the similarity between the ground truth communities and the identified communities. In addition, we provide two additional measures: the ratio $\hat{C}/C$ for the accuracy of the number of identified communities and the F1 score. We compute the latter similarly to [50]: first, for each protocol $c \in C$ we identify the detected community $\hat{c} \in \hat{C}$ that maximizes the F1 score. Then, we report average precision, recall, and F1 scores over all communities $c \in C$. Note that we compute the above metrics only on the labeled CAs.

[Figure 4, x-axis labels: 0x, 1inch, aave, badger, balancer, barnbridge, compound, convex, curvefinance, dydx, fei, futureswap, harvestfinance, hegic, instadapp, maker, nexus, sushiswap, synthetix, uniswap, unknown, vesper, yearn.]

Table 7. Performance metrics for the community detection algorithms. Low F1 Scores indicate either that the algorithms poorly identify communities, or that the network topology reflects a more complex organization at the mesoscopic level.

| Algorithms | Communities | Precision | Recall | F1 Score | NMI | $\hat{C}/C$ |
|---|---|---|---|---|---|---|
| Louvain | 14 | 0.3896 | 0.7181 | 0.2917 | 0.9241 | 0.6087 |
| Leiden | 10 | 0.3021 | 0.8589 | 0.2879 | 0.9620 | 0.4348 |
| Label prop. | 53 | 0.7107 | 0.6009 | 0.4892 | 0.9404 | 2.3043 |
| Eigenvector | 4 | 0.1696 | 0.9070 | 0.1776 | 0.9495 | 0.1739 |
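The best-match F1 evaluation can be sketched as follows. This is a minimal illustration under assumptions: ground truth is given as a mapping from protocol name to a set of labeled addresses, detected communities as a list of address sets, and unlabeled addresses are dropped before scoring. The function names are hypothetical.

```python
def f1_parts(truth, detected):
    """Precision, recall and F1 of one detected community against one protocol."""
    overlap = len(truth & detected)
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(detected)
    recall = overlap / len(truth)
    return precision, recall, 2 * precision * recall / (precision + recall)


def best_match_scores(ground_truth, communities):
    """Average precision, recall and F1 over protocols, each matched to its best community."""
    labeled = set().union(*ground_truth.values())
    rows = []
    for members in ground_truth.values():
        # Restrict each detected community to labeled addresses before scoring.
        candidates = [f1_parts(members, c & labeled) for c in communities]
        rows.append(max(candidates, key=lambda parts: parts[2]))
    n = len(rows)
    return tuple(sum(row[i] for row in rows) / n for i in range(3))
```

Because precision, recall and F1 are averaged separately over protocols, the mean F1 can fall below both the mean precision and the mean recall, as in Table 7. The last column of that table is consistent with dividing the number of detected communities by the 23 labeled protocols, for example 14 / 23 ≈ 0.6087 for Louvain.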
The second column of Table 7 reports the total number of communities that include labeled CAs. The NMI is high for all the algorithms, indicating that overall they correctly partition the network: indeed, all algorithms cluster together the CAs created by the 1inch deployer contract, and 1inch is by far the largest ground truth community in terms of labeled accounts. On the other hand, the low F1 scores (0.18-0.49) result from a small set of misclassified ground truth communities (e.g., Compound, DyDx, Fei). Upon closer inspection, we noticed that some protocols map entirely into a few communities dominated by larger protocols (such as UniSwap or Maker), negatively impacting precision, while others are split into different communities, affecting recall. 1inch itself has a non-marginal number of addresses that map into other communities. In summary, we see that algorithms work well, with NMI scores above 0.92. However, when considering the imbalance in our dataset (precision, recall), we find that known community detection algorithms cannot effectively identify protocols as distinct communities, but rather indicate protocol composition patterns. The identified community structure reflects a different organization in which protocols are entangled.

5 MEASURING DEFI COMPOSITIONS

After analyzing the macroscopic network perspective, we now address the microscopic trace level, where we identify and extract building blocks, i.e., recurring patterns of internal traces induced by protocol-specific CAs that are found as subpatterns within different transactions. The building block detection can help better understand DeFi compositions and identify a variety of risks. We consider a detailed risk analysis to be future work, but can motivate some sources of risk: for example, if security vulnerabilities are identified in underlying building blocks, they can propagate to higher levels and pose a risk to other DeFi protocols. Atzei et al. [1] analyze the security vulnerabilities of Ethereum code accounts and attacks that exploit them. Legal issues may arise, including licensing issues, thereby limiting usability in other protocols. This phenomenon also exists in traditional software (see https://www.techradar.com/news/this-popular-code-library-is-causing-problems-for-hundreds-of-thousands-of-devs). Finally, the technical evolution of a blockchain can also have an impact on the efficiency or security of an existing building block, and here too it is important to identify which protocols are affected.

Thus, we propose an algorithm to extract the possibly nested structure of DeFi protocol calls, which may also be used by other DeFi protocols. In contrast to recent works that have discovered and exposed DeFi compositions, we provide a systematic, automated mechanism to explore them by using building block extraction. We then assess the most frequent building blocks our algorithm identifies, illustrate possible DeFi compositions, and show how the DEX aggregator 1inch and the Instadapp protocol use multiple such building blocks of other protocols. Further, we flatten the nested structure of building blocks and study the interaction of DEX and lending services. Finally, we present in a case study the dependencies of DeFi protocols on stablecoins, by using our extracted building blocks.

5.1 Building Block Extraction Algorithm

In order to detect building blocks, we treat individual transactions as trees of execution traces, that is, as an abstraction where the external and all the internal transactions are represented as an edge to a new node (thus, the same CA appears multiple times if executed more than once). We break the trees into subtrees, starting from the tree's leaves, and identify a building block whenever we encounter a node that is part of a protocol. If multiple protocol nodes exist in a tree, the building blocks can be composed of one another.
To obtain the nested structure, we create a hash of each building block and use those hashes to chain nested tree structures. Figure 5 illustrates the process from a high-level perspective. Subfigure 5a represents the input, which corresponds to the original transaction trace graph that we have also shown in the introductory Figure 1. We aim to identify building blocks that execute the same logic despite being different instances involving different addresses (e.g., a swap with different tokens). We preprocess and generalize the execution trace trees as follows:

Preprocessing: In contrast to a graph, like in Figure 5a, an execution tree can have the same node appearing multiple times as a leaf node, effectively having no cycles. Each edge has a trace ID, determining the order of the calls. If a contract address that has been deployed by a factory appears in a trace, we rename it to $protocol-DEPLOYED. Furthermore, we rename to ASSET all contract addresses whose smart contract code contains the standard ERC20 token method signatures and that are called with one such method within the trace. The result of these preprocessing steps is shown in Figure 5b. This preprocessing assumes that factory deployed contracts and ERC20 token contracts provide similar functionality. This allows us to generalize the traces, as many similar interactions with various standardized tokens become identical.

[Figure 5 panels: (a) Original transaction trace graph of the composition as shown in Figure 1. (b) Conversion to execution tree, renaming factory deployed contracts and assets. (c) Identification of general building blocks from protocol nodes with subtraces; the panel marks Building block 1, Building block 2 and Building block 3.]

Fig. 5. A high-level illustration of the building block extraction algorithm. Subfigure 5a represents the input composition.
This graph is then converted into an execution tree as shown in Subfigure 5b, such that each node can only have one incoming edge, requiring the duplication of nodes. In addition, the underlying assets (tokens) and factory deployed contracts are renamed. In this example, the trading pair contracts are factory deployed (FD). This allows for the identification of generalized building blocks, as each trading pair only differentiates itself by the specific assets it is dealing with. The result of the building block extraction is then shown in Subfigure 5c, and is the result of a bottom-up processing of the tree, selecting subtrees of known protocol nodes. See Algorithm 1 for more details.

Algorithm 1: Building Block Extraction

Inputs: (1) Directed, attributed transaction trace tree T(V, E, t, m) with functions t : E → N assigning a unique trace ID, and m : E → N assigning a method ID on the edges of the tree, (2) protocol vertices P
Outputs: Lists of building blocks B, and hashes H

 1  B ← ( );                                        // Init. list of building blocks
 2  H ← ( );                                        // Init. list of building block hashes
 3  E_P ← (uv) | ∀ uv ∈ E : v ∈ P;                  // Edges to protocol nodes
    // For each edge to a protocol, get subtree
 4  E_S ← (E_s) | edges reachable from e_p for each e_p ∈ E_P;
 5  S ← (T[E_s]) | ∀ E_s ∈ E_S;                     // Edge induced subtrees
 6  S ← filter(S, by=tree-depth, minimum=2);
 7  S ← sort(S, by=tree-depth, how=ascending);
 8  for T_s(V_s, E_s, t, m) ∈ S do                  // For each subtree
 9      // Compute building block hash with V', D, M
10      E_s ← sort(E_s, by=t(e), how=ascending);    // Sort edges
11      V' ← (v_1, ..., v_n) = v | ∀ uv ∈ E_s : v ∈ V_s;   // Vertex list
12      D ← deg_out(v) | ∀ v ∈ V';                  // Outdegree list
13      M ← m(E_s);                                 // Method ID list
14      h ← sha256hash(stringify(V', D, M));
15      b ← T[V'];                                  // Building block as vertex induced subtree
16      replace(what=b, in=S, with=h);
17      B ← B ∥ b;                                  // Append building block
18      H ← H ∥ h;                                  // Append building block hash
19  end
20  return B, H

Algorithm 1 takes as input a transaction trace tree T(V, E, t, m) with two edge attributes: the trace ID t, indicating the order of execution, and m, indicating the method ID of the executed call. The second input is a list of seed protocol nodes, such as those described in Section 3.1.2. The algorithm outputs a list of building blocks and hashes of such building blocks. We first set up the output variables in lines 1-2. We then find edges to the protocol nodes in line 3 and extract all further reachable edges of these to obtain edge-induced subtrees in lines 4-5. We filter them in line 6 to include only those with a minimum depth of 2, such that the protocol node has to make further calls. In line 7, we sort the list of subtrees in ascending order of depth. This means small trees are at the beginning of the list, and large trees that may contain these smaller trees are at the end. For each subtree (line 8), we compute a hash in lines 9-14, akin to a tree kernel. To compute the hash, we first sort the subtree's edges by order of execution in line 10, and then extract the target vertices of each edge in line 11, essentially excluding the original calling node, which could be different in each transaction. For each of those vertices, we compute the outdegree (line 12), and also determine the method ID for each edge (line 13). The hash is then computed from the three aforementioned properties in line 14. Using the target vertices, we retrieve the building block from the original tree (line 15), which may contain leaf nodes of building block hashes
as replacing subtrees in line 16 can lead to nested building blocks. Finally, we append building block and hash to their lists in lines 17-18. Once all subtrees are processed, the lists are returned in line 20. An example of the algorithm's result can be seen in Figure 5c, showing three building blocks, one each from SushiSwap, UniSwap and 1inch. Note that the building block of 1inch contains the other two building blocks.

[Figure 6: tree diagrams of ten building blocks, labeled by protocol and count: 1) uniswap (21,769,746); 2) uniswap (6,198,521); 3) 0x (5,910,146); 4) uniswap (1,804,012); 5) sushiswap (1,250,574); 6) uniswap (1,037,881); 7) uniswap (1,007,538); 8) uniswap (857,377); 9) uniswap (848,682); 10) uniswap (810,287). Root method labels shown in the figure include swap, swapExactETHForTokens, swapExactTokensForTokens, withdraw and uniswapV3SwapCallback.]

Fig. 6. The 10 most frequently observed building blocks by called root method, root protocol and count. Nodes marked with FD are generalized factory deployed contracts and those marked with A are ERC20 assets. The majority of these building blocks originate from UniSwap. Note that block 1 of UniSwap is equivalent to number 5 of SushiSwap. This makes sense, as SushiSwap is a fork of UniSwap. Number 1 is contained in building blocks 2 and 4, illustrating an internal composition within the same protocol. Building block 3 represents the withdrawal of Wrapped Ether (WETH) and is associated to the protocol 0x. Also note that several root methods are identical, yet can lead to different types of building blocks.

5.2 Building Block Analysis

We execute the algorithm on all transactions in our dataset, together with the set of DeFi protocols in our labeled extended seed set (cf. Section 3). We can then count the retrieved building blocks by their hashes, understand their composition, and visualize them. Figure 6 illustrates the top 10 most frequently observed building blocks, of which eight belong to UniSwap.
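The extraction and the hash-based counting can be sketched in a compact recursive form. This is a simplified illustration, not the authors' implementation: it assumes that children are already stored in trace order, uses readable method names instead of numeric method IDs, renames every listed ERC20 address to ASSET without checking the called method, and processes the tree bottom-up by recursion instead of sorting subtrees by depth. The names `Call`, `generalize`, `extract_building_blocks` and `count_blocks` are hypothetical.

```python
import hashlib
import json
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Call:
    """One edge of the execution tree: a call into `target` with `method`."""
    target: str                                    # generalized callee label
    method: str                                    # method ID of the call
    children: list = field(default_factory=list)   # sub-calls in trace order


def generalize(address, factory_deployed, erc20_assets):
    """Rename factory-deployed contracts and ERC20 tokens (simplified rule)."""
    if address in factory_deployed:
        return f"{factory_deployed[address]}-DEPLOYED"
    return "ASSET" if address in erc20_assets else address


def _collapse(call, protocol_of, found):
    """Rows (vertex, out-degree, method) of a subtree, nested blocks as hash leaves."""
    rows = [(call.target, len(call.children), call.method)]
    for child in call.children:
        rows.extend(_collapse(child, protocol_of, found))
    # A building block is a protocol node that makes further calls.
    if call.target in protocol_of and call.children:
        vertices, out_degrees, methods = zip(*rows)
        payload = json.dumps([vertices, out_degrees, methods])
        digest = hashlib.sha256(payload.encode()).hexdigest()
        found.append((digest, protocol_of[call.target], rows))
        return [(digest, 0, call.method)]  # enclosing blocks see only the hash
    return rows


def extract_building_blocks(transaction, protocol_of):
    """Return (hash, protocol, rows) tuples, innermost blocks first."""
    found = []
    _collapse(transaction, protocol_of, found)
    return found


def count_blocks(transactions, protocol_of):
    """Count building block occurrences by hash across transactions."""
    counts = Counter()
    for tx in transactions:
        counts.update(h for h, _, _ in extract_building_blocks(tx, protocol_of))
    return counts
```

Because the calling node is not part of the hashed rows, a swap reached through an aggregator and a swap called directly by an externally owned account produce the same hash, which is what allows occurrences to be counted across transactions.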
The most frequent building block is a UniSwap swap, with more than 21 million occurrences. As UniSwap is one of the most popular DeFi protocols, and token swaps are its main functionality, this result shows that the building block extraction is meaningful. We further observe that the swap building block is reoccurring and contained in other patterns that appear frequently. Another relevant block is related to 0x's Wrapped Ether (WETH), which in our context is not classified as an asset due to its use of withdrawal, a non-ERC20 function. In the following, we will provide more insights into the nested structure from different perspectives and discuss their interpretations.

[Figure 7 legend: EOA; building blocks of DeFi protocols: 0x, aave, balancer, compound, curvefinance, dydx, hegic, maker, sushiswap, synthetix, uniswap.]

Fig. 7. Illustrating the composition space of Aave as a network tree. Each node represents a building block, each link a possible nested building block, extracted from all transactions to Aave. We observe for this protocol a maximum depth of seven nested DeFi building block levels.

5.2.1 Protocol Building Block Composition. Starting from the execution tree structure of each trace, the algorithm identifies subtrees. Those building blocks obtained from Algorithm 1 can contain leaves with hashes that point to other building blocks, leading to a nested structure that still preserves the primary tree structure of the traces. But a single transaction only represents a small snapshot of the entire tree of possible compositions. For a comprehensive image of the DeFi protocols composition space, we have to consider multiple transactions. To observe the space of all possible compositions, we construct a network of overlapping building block trees for all transactions of the same initial (external) DeFi protocol. For an illustrative example, we used the extracted building block structures of all transactions to Aave.
The network still conserves the tree structure, where each node represents a building block and each link a nested composition observed in the transactions. Figure 7 shows the Aave network and illustrates its multiple nested levels. Starting from the top with external transactions from EOAs to Aave, a variety of paths and compositions can be seen, presenting the space of all possible compositions observed from existing transaction data. Nevertheless, this network illustration does not provide a comprehensive picture of the volume (i.e., number of appearances) of those compositions and the number of branches when a building block calls multiple sub-blocks.

We can inspect for each building block the set of contained protocols and the volume of their appearances: the treemaps in Figure 8 illustrate the shares of protocols appearing in the building block structure of a specific nested level. In Figure 8a we observe the volume of building block calls and associated protocols in the first level for the protocol 1inch. The largest fraction are external transactions that do not contain any other building blocks; this is captured by the box labeled as NONE. All other boxes show instead the share of transactions in which one or multiple DeFi service building blocks are nested. We group them using different colors based on the number of unique, distinct protocols that are called in the subsequent building blocks of this level. For instance, yellow boxes indicate the fraction of transactions in which the appearing nested building blocks in the first level are associated to one single DeFi protocol, while blue boxes represent the fraction in which the building blocks in the first level are associated to two different protocols. We further observe portions of transactions that contain building blocks assigned to more than two protocols within the first nested level. Moreover, the treemap in Figure 8b shows branches in a deeper level within Instadapp transactions.
In the fourth level of self-compositions, besides the fraction that does not contain any further block (NONE), an even bigger share of building blocks appear that are associated to one single DeFi protocol. We again observe building blocks associated to two or more protocols. These two illustrations in Figure 8 give insights into our systematic investigation of compositions, and show that looking only at selected compositions or single nested levels of DeFi compositions would return a partial picture: interactions among protocols can be iteratively nested within each other and can take place in deeply nested levels. Therefore, a further investigation to disentangle and flatten the nested structure is needed.

[Figure 8: treemaps. Panels: (a) 1inch; (b) instadapp→instadapp→instadapp→instadapp. Boxes are grouped by the number of distinct protocols (0, 1, 2, 3); box labels include NONE, 1inch, 0x, uniswap, sushiswap, balancer, compound, aave, instadapp and the pair 0x,uniswap.]

Fig. 8. Inspecting the potentially nested building blocks used by the first level of 1inch (left) and the fourth level of Instadapp (right). The size of each box represents the share of building blocks assigned to one or more unique protocols. For 1inch transactions, at the first nested level, about a third of the used building blocks are of one (chiefly other) protocol (yellow boxes). An even bigger fraction can be observed for Instadapp, but in the fourth nested building block level.

5.2.2 Flattening Composition Hierarchies. We then want to investigate to what extent the DeFi protocols leverage other protocols to provide their services. That is, we want to identify a mapping of top-level protocols to any of the building blocks they make use of, whether deeply nested or not. To get an overall picture of the DeFi compositions, we flatten the nested building block structures.
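One way to sketch such a flattening is shown below. The input format is a hypothetical one chosen for illustration: each transaction is a pair of its root protocol and a nested list of `(protocol, children)` building blocks. For every root protocol, the sketch reports the percentage of transactions in which a block of each protocol appears at any nesting depth, and uses the label NONE for transactions without any building block.

```python
from collections import Counter, defaultdict


def contained_protocols(blocks):
    """Protocols appearing anywhere in a nested list of (protocol, children) blocks."""
    seen = set()
    stack = list(blocks)
    while stack:
        protocol, children = stack.pop()
        seen.add(protocol)
        stack.extend(children)
    return seen


def flatten(transactions):
    """Map root protocol -> {protocol: % of its transactions containing such a block}."""
    totals = Counter()
    hits = defaultdict(Counter)
    for root, blocks in transactions:
        totals[root] += 1
        # Each protocol counts at most once per transaction, however deep it sits.
        for protocol in contained_protocols(blocks) or {"NONE"}:
            hits[root][protocol] += 1
    return {
        root: {p: 100.0 * c / totals[root] for p, c in counts.items()}
        for root, counts in hits.items()
    }
```

Since one transaction can contain blocks of several protocols, the percentages in a row need not sum to 100.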
In each transaction, we follow the cascade of nested building blocks and create a mapping from the contained protocol of building blocks to the original DeFi protocol that the external transaction was sent to (the root protocol). If mappings appear multiple times over different transactions, we aggregate them. For each root protocol, we can then compute how frequently building blocks of each associated protocol are contained, over all transactions. The result is a measure that indicates, for a given root protocol, the probability that a certain building block of a DeFi protocol appears anywhere in the (nested) building block structure.

In Figure 9 we show the building block appearances of lending, DEX, derivatives and asset protocols with a heat map. Each row corresponds to the external calls to a specific protocol, and the row entries indicate the frequencies of the occurrence of a protocol's building blocks. The relative share measurement is the fraction of internal building blocks based on the number of external transactions. Note that the NONE category indicates the share of transactions for which no building blocks have been found. Most protocol interactions exist within each protocol, visible by the highlighted diagonal elements. This pattern is especially remarkable for derivative protocols. Consider, e.g., dYdX: all external transactions directed to it contain at least one dYdX building block. However, DeFi aggregation protocols such as Instadapp, 1inch, and 0x in particular show extensive use of other DeFi services and thus frequent occurrences of DeFi compositions. This indicates Algorithm 1 works as intended, as, by definition, aggregation protocols must call other protocols. The frequent appearance of the 0x protocol can be attributed to the popular Wrapped Ether token and its withdraw pattern, already observed and shown in Figure 6. Further, we note that, second to 0x, UniSwap building blocks appear in most transactions to the protocols shown in Figure 9. Derivatives protocols instead have little or no further interaction with other protocols, as shown in the rows associated with derivatives in the matrix of heat maps; likewise, the assets protocols do not interact heavily with other protocols.

[Figure 9: matrix of heat maps. Rows: external transaction to protocol, grouped into Lending (aave, compound, instadapp, maker), DEX (0x, 1inch, balancer, curvefinance, sushiswap, uniswap), Derivatives (barnbridge, dydx, futureswap, hegic, nexus, synthetix) and Assets (badger, convex, fei, harvestfinance, renvm, vesper, yearn). Columns: protocol building blocks appearance [%], grouped into Lending, DEX, Derivatives, Assets and Others. The individual cell values cannot be aligned to columns in this text version.]

Fig. 9. Appearances of DeFi service building blocks across protocols. The numbers indicate the percentage of transactions in which a building block of a certain protocol is contained. The use of multiple DeFi services can be observed for DeFi aggregation protocols, like Instadapp, 1inch and 0x.

5.3 Case Study: A hypothetical run on the Tether

In May 2022, we witnessed the collapse of the Terra ecosystem and its stablecoin TerraUSD (UST), which maintained its peg to the US Dollar through an arbitrage mechanism with the token LUNA. This triggered a so-called stablecoin run and destroyed over 30B USD of value within a single week.
Motivated by this recent demonstration of systemic risk associated with stablecoins, we apply our building block extraction and analysis methods to measure how a hypothetical run on the stablecoin Tether (USDT, token contract 0xdAC17F958D2ee523a2206206994597C13D831ec7), which is the most widely adopted stablecoin in Ethereum, would affect known DeFi protocols based on building block dependencies. We distinguish between direct dependencies, where USDT is an explicit part of a building block, and indirect dependencies, where USDT appears somewhere in its nested building blocks. Starting with the most frequent building blocks (see Figure 6), we analyzed the occurrence of USDT in the regularly used sub-patterns of transactions. We detected USDT in 10.6% of 'swap' building blocks from UniSwap (1) and 16.2% from SushiSwap (5). For the 'swapExactTokensForTokens' building block from UniSwap (2), we find an even higher direct occurrence of 22.7% and an indirect dependency of a further 21.2% through the nested block structure, which contains the aforementioned 'swap' building blocks from UniSwap (1).

[Figure 10: chart of "Building Blocks containing USDT [%]" per DeFi protocol; legend: USDT included directly, indirectly.]

Fig. 10. Dependencies of building blocks on the USDT crypto asset for each DeFi protocol. We distinguish between assets that are included directly and assets that are included indirectly through nested building blocks.

In order to obtain a broader picture of the dependencies in the DeFi ecosystem, we also analyzed, for each protocol, the fraction of building blocks containing the USDT asset directly or indirectly in more deeply nested blocks. Our results, which are summarized in Figure 10, show that most protocols have rather low dependencies (< 10%).
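The distinction between direct and indirect dependencies can be illustrated with a minimal sketch. It is not the implementation used in the study: the `BuildingBlock` class, the helper names, and the placeholder asset identifiers are hypothetical, and counting a block as indirect only when it does not already contain the asset directly is an assumption suggested by the wording "a further 21.2%".

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Set, Tuple

# USDT token contract address as quoted in the text, lower-cased for comparison.
USDT = "0xdac17f958d2ee523a2206206994597c13d831ec7"
WETH = "weth-placeholder"  # hypothetical stand-ins for other assets
DAI = "dai-placeholder"


@dataclass
class BuildingBlock:
    """Hypothetical node of a nested building block tree."""
    protocol: str
    name: str
    assets: Set[str] = field(default_factory=set)  # assets used by this block itself
    children: List["BuildingBlock"] = field(default_factory=list)


def walk(block: BuildingBlock) -> Iterator[BuildingBlock]:
    """Yield the block and all blocks nested below it."""
    yield block
    for child in block.children:
        yield from walk(child)


def is_direct(block: BuildingBlock, asset: str) -> bool:
    return asset in block.assets


def is_indirect(block: BuildingBlock, asset: str) -> bool:
    """True if the asset appears anywhere in the nested blocks."""
    return any(asset in d.assets for c in block.children for d in walk(c))


def dependency_shares(
    roots: List[BuildingBlock], asset: str
) -> Dict[str, Tuple[float, float]]:
    """Per protocol: (% of blocks with direct, % with only indirect dependency)."""
    totals: Dict[str, int] = defaultdict(int)
    direct: Dict[str, int] = defaultdict(int)
    indirect: Dict[str, int] = defaultdict(int)
    for root in roots:
        for block in walk(root):
            totals[block.protocol] += 1
            if is_direct(block, asset):
                direct[block.protocol] += 1
            elif is_indirect(block, asset):  # assumption: indirect excludes direct
                indirect[block.protocol] += 1
    return {
        p: (100.0 * direct[p] / n, 100.0 * indirect[p] / n)
        for p, n in totals.items()
    }


# Toy data: one router call wrapping a USDT swap, two unrelated swaps, one mint.
swap_usdt = BuildingBlock("uniswap", "swap", assets={USDT, WETH})
router = BuildingBlock("uniswap", "swapExactTokensForTokens", children=[swap_usdt])
plain_swaps = [BuildingBlock("uniswap", "swap", assets={WETH, DAI}) for _ in range(2)]
mint = BuildingBlock("compound", "mint", assets={DAI})

shares = dependency_shares([router, *plain_swaps, mint], USDT)
```

In this toy data, one of the four UniSwap blocks contains USDT directly and one contains it only through a nested 'swap' block, which mirrors the relationship between 'swapExactTokensForTokens' and 'swap' described above.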
Notable exceptions exist: 14.2% of Curvefinance building blocks include the USDT asset directly, and the two DEX protocols UniSwap and SushiSwap also depend strongly on that asset. This is in line with our previous finding that 'swap' is the most frequent building block. We further find that Compound and Instadapp building blocks have comparatively high indirect dependencies on the USDT asset. These dependencies indicate how a shock in the DeFi ecosystem, such as a run on a stablecoin, could affect DeFi protocols, directly and indirectly, through their services. Since USDT has become a multi-chain asset, which is also traded and used on other blockchains (e.g., Binance Smart Chain, Avalanche), such shocks could also spread across chains and lead to systemic failures. However, we consider this analysis a first step towards a deeper investigation of systemic risk and leave that investigation for future work.

6 DISCUSSION

In this section, we discuss some of the insights from our analyses, as well as the limitations of our work.

6.1 Insights

Cryptoassets are not a niche phenomenon anymore. They reached an overall market capitalization of more than 2T USD (Nov. 2021) and are increasingly interconnected with the traditional financial systems. With DeFi, we now see the introduction of leveraged financial products and assets that are backed by some poorly understood virtual securities. Our results provide initial insights into the motivating questions mentioned in the introduction. Concerning ecosystem interoperability, we found that compositions between DEX protocols are particularly frequent in our dataset (cf. Figure 9).
From this, we can conclude that these protocols should ideally be deployed on the same DLT platform as long as single-transaction cross-chain compositions are not possible. At the same time, however, we also found that derivative protocols in particular still contain relatively few compositions. This suggests that a protocol-type specific scaling solution, for example a sidechain for derivative protocols, could be useful. Fewer compositions would then still be possible, but the negative impact would be less significant than when separating DEX protocols.

As far as integration with web technologies is concerned, the versatile use of building blocks shows that elementary constructs are already reused and integrated by various applications, without this necessarily being transparent to the users. The view is further reinforced when considering that various assets are already integrated into web technologies, but their simultaneous inclusion in financial instruments and compositions is hardly obvious. An example of this is the BAT token, which is integrated into the Brave browser but is also used in various DeFi protocols.

Finally, turning to risks through complexity, we recall that the financial crisis of 2008 has shown that a lack of understanding and a lack of regulation can pose unforeseeable risks for the financial markets and our society as a whole. Whilst composability unleashes unexplored possibilities, it may also lead to unforeseen risks. Indeed, although DeFi protocols are aware of, and often even facilitate, the use of their own CAs in composition with those of other protocols, these interconnected novel financial services lack a form of coordination on the resulting compositions. Thus, unintended forms of interaction across protocols could take place, exposing users to risk, even more so when calls are iteratively nested and several protocols are indirectly involved.
If the DeFi ecosystem evolves at the current pace and integrates closely with the traditional financial sector, associated systemic risks must be understood and mitigated. Our work shows how DeFi protocols can be decomposed and how the share of protocol interactions can be visualized (cf. Figure 8). With our case study we simulate a hypothetical run on Tether and show how our method can provide first insights into how DeFi protocols and their services could be affected, also through cascading effects from other protocols. This shows the potential for further studies to evaluate systemic risk.

6.2 Limitations

We acknowledge and point out some limitations of our work. First, our results naturally reflect only the compositions of the protocols and labeled addresses contained in our ground truth dataset. Since the DeFi landscape is evolving rapidly, extending our seed data and the observation period, as well as investigating the temporal evolution of the DeFi protocols, is an obvious next step. One can then re-run our generally applicable analytics procedures. We emphasize, however, that while a longitudinal analysis of DeFi usage over a longer time frame would be of interest, our main contribution is the devised methodology to uncover compositions. The time frame and extent of the DeFi protocol activity we investigated are sufficiently large for this (static) analysis. Second, as we focused on composability, we did not investigate some features of the network topology, such as its small-world properties (e.g., clustering coefficients and path lengths); we studied recurrent patterns by decomposing individual transactions as nested building blocks, rather than studying triadic (or higher order) motifs and core decomposition methods; Topological Data Analysis (TDA) has been exploited in the literature mostly in predictive models to identify anomalous patterns, which is beyond the scope of our work; similarly,
temporal aspects are left for future work, as discussed previously. In our network analysis, we currently neglect edge weights between CAs, which may indicate the strength of composition. Including them could also be part of future work. Third, our building block extraction algorithm currently yields the building blocks of known DeFi protocols. We believe that future work should aim at a more systematic evaluation using a curated ground truth of DeFi compositions. Finally, we point out that currently we mainly focus on single-transaction interactions between CAs. However, DeFi compositions could also be constructed by EOAs over time using multiple transactions. We do not yet consider this aspect in our analysis, but we deem it one of the most promising avenues for future work.

7 CONCLUSION

The overall goal of our work is to provide methods and results that contribute to a better understanding of DeFi protocols, which are a new family of financial products. We manually curated a ground truth set of 23 DeFi protocols, which can be reused in future research. We constructed network abstractions representing the interactions between smart contracts (CAs) and DeFi protocols and conducted a topology analysis in the timespan from Jan-2021 to Aug-2021. The results indicate the existence of compositions, which is further supported by our finding that known community detection algorithms cannot disentangle DeFi protocols. Therefore, we proposed an algorithm that extracts the building blocks of DeFi protocols from transactions. We assessed the most frequent blocks and found that swaps play an essential role. We also analyzed individual DeFi protocols by disentangling their building blocks and flattened the composition hierarchies of all DeFi protocol transactions in our dataset. We provide a case study that discovers how the building blocks depend on the USDT stablecoin.
This shows how the proposed method can help identify potential systemic risk by measuring to what extent each protocol is affected by the propagating shock of a single entity, originating from vulnerabilities, legal issues or technical advances. Finally, we have discussed the implications and limitations of our work, providing first insights into questions about interoperability, integration with Web technologies, and systemic risks that may arise in complex financial systems. In summary, our work is the first that investigates DeFi compositions across multiple protocols, both from a network perspective and at the level of individual transactions. We believe that our methods make an essential contribution to understanding the bigger picture and the basic building blocks of individual DeFi protocols and their relationships across protocols.

Journal

ACM Transactions on the Web (TWEB), Association for Computing Machinery

Published: Mar 27, 2023

Keywords: Decentralized Finance
