Digital myths: the true cost of a data-driven world
In today’s world, digital tools are everywhere. Whether for work, leisure, or keeping in touch with friends and family, we rely on IT to facilitate our lives in the 21st century. These tools have evolved drastically over time, and nowadays it feels like the digital world is a limitless resource, a cyberspace of endless possibilities with no physical consequences. But is this perception reality, or merely a mirage?
In our Position Paper on Innovation, we highlighted that digitally mature companies were the ones that emerged strongest from the 2020 crisis. They were also the ones that most consistently dedicated time to experimentation and innovation.
At Mantu, we take on this challenge by emphasizing the need for sustainable innovation in the digital sector. Our goal is to secure the future of the digital sector in a world where available resources are becoming increasingly constrained.
To bridge the gap between where we are today and where we need to be tomorrow, we explore the myths that contribute to a dangerously flawed perception of the digital world.
Myth 1: Infinite data
As the pillar of IT, data has always played a central role in the development of digital products. At the dawn of this era, an IBM 350 disk storage unit the size of a wardrobe could hold about 3.75 megabytes (3.75 million bytes) of data, at a cost of around $3,200 a month. Today these numbers seem unbelievable: a single image can be ‘heavier’ than the entire capacity of that storage wardrobe.
As the years passed, technology improved, allowing more and more storage. We now talk about personal storage capacities in terms of gigabytes and terabytes, equivalent to one billion and one trillion bytes respectively. The scale of data usage can also be measured through internet traffic. According to a 2018 Cisco study [1], 4.7 zettabytes (4.7 sextillion bytes) of traffic flowed over the internet globally between 1984 and 2018. The same study predicted that traffic in 2022 alone would exceed that of all previous years up to 2016 combined.
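To get a feel for these scales, the figures above can be put side by side in a few lines of Python. This is only an illustrative sketch: it assumes decimal (SI) units, and the names `UNIT_BYTES` and `to_bytes` are invented for the example.

```python
# Decimal (SI) byte units, matching the definitions used in the text.
UNIT_BYTES = {
    "MB": 10**6,    # megabyte: one million bytes
    "GB": 10**9,    # gigabyte: one billion bytes
    "TB": 10**12,   # terabyte: one trillion bytes
    "ZB": 10**21,   # zettabyte: one sextillion bytes
}


def to_bytes(value, unit):
    """Convert a quantity expressed in `unit` into bytes."""
    return value * UNIT_BYTES[unit]


ibm_350_capacity = to_bytes(3.75, "MB")   # one IBM 350 unit
internet_traffic = to_bytes(4.7, "ZB")    # Cisco figure for 1984 to 2018

# How many wardrobe-sized IBM 350 units would that traffic fill?
units_needed = internet_traffic / ibm_350_capacity
print(f"{units_needed:.2e} IBM 350 units")
```

The result is on the order of 10^15 units, which gives a sense of how far storage and traffic have moved in a few decades.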
As for the quality of this huge quantity of data, the trend is moving in the opposite direction. If we measure data quality through the prism of its usage, a study by the Active Archive Alliance [3] found that 80% of data is never accessed or used after it is stored. Moreover, according to technology specialist Lucidworks [4], only 10% of the data collected is analyzed by businesses. In other words, effort, money, time, and resources are invested in producing what is effectively digital waste.
How did we get to such absurd statistics? The low price of storage [5] is the primary reason. Nowadays, storing data costs less than editing it, cleaning it, and getting rid of it once it has become obsolete. Failing to evaluate the price of the entire data lifecycle piles up costs and problems for the future.
What about data science and big data? These fields are not magic, and using them doesn’t remove the need to manage the data lifecycle. Editing and cleaning are indispensable steps to make data exploitable and, most importantly, reliable. Incorrect data, wrong formats, and incomplete values must all be addressed to avoid drawing false conclusions from an analysis. The mirage of infinite data may persist, but at a cost we can barely grasp.
Mantu brand Amaris Consulting tackles this problem through its Green IT offer, which integrates sustainable best practices into software engineering.
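As a minimal sketch of what such cleaning can look like in practice, the Python snippet below filters a handful of records for the three problems named above. The record layout (`user_id`, `signup_date`, `age`), the plausibility threshold, and the function name `clean_record` are all hypothetical and chosen only for illustration.

```python
from datetime import datetime

REQUIRED_FIELDS = ("user_id", "signup_date", "age")  # hypothetical schema


def clean_record(record):
    """Return a cleaned copy of `record`, or None if it cannot be trusted."""
    # Incomplete values: reject records with a missing or empty required field.
    if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return None
    # Wrong formats: accept only year-month-day dates.
    try:
        signup = datetime.strptime(record["signup_date"], "%Y-%m-%d").date()
    except (TypeError, ValueError):
        return None
    # Incorrect data: reject ages that are not numbers or not plausible.
    try:
        age = int(record["age"])
    except (TypeError, ValueError):
        return None
    if not 0 < age < 120:  # assumed plausibility bounds
        return None
    return {
        "user_id": str(record["user_id"]).strip(),
        "signup_date": signup,
        "age": age,
    }


raw_records = [
    {"user_id": " u1 ", "signup_date": "2021-03-14", "age": "34"},
    {"user_id": "u2", "signup_date": "14/03/2021", "age": "29"},   # wrong date format
    {"user_id": "u3", "signup_date": "2021-05-02", "age": ""},     # incomplete value
    {"user_id": "u4", "signup_date": "2021-06-30", "age": "250"},  # incorrect data
]

cleaned = [rec for rec in map(clean_record, raw_records) if rec is not None]
print(f"kept {len(cleaned)} of {len(raw_records)} records")
```

Even in this toy case only one record in four survives, which is the kind of hidden cost that a storage-only view of data ignores.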
Myth 2: Digital as an energy savior
The digital sector relies heavily on energy usage, but how much energy does it really consume? In a world where the effects of climate change mean urgent action must be taken to reduce and optimize energy use, the digital sector’s hunger for energy is skyrocketing.
To illustrate this consumption, we mostly focus on the manufacturing and usage phases. The end-of-life phase of a product also has a footprint, but according to The Shift Project [6], the data available today is not sufficient to quantify its impact. Historically, we have measured energy consumption only during the usage phase of a product. As product complexity increases and lifespans shorten, measuring energy consumption only during the usage phase is no longer sufficient.
Compare the manufacturing process of an internal combustion engine vehicle, with a lifespan of 8 to 15 years, and a smartphone, with a typical lifespan of 2 to 3 years [7, 8]. The Shift Project highlights [6] that with the increasing complexity and miniaturization of products, producing 1 g of a smartphone now requires about 80 times more energy than producing 1 g of a car. This explains why up to 94% of the total energy consumed by a smartphone during its entire lifespan comes from the manufacturing process, before it is even first turned on [6]. This percentage doesn’t take into account the energy used by the network and data centers as a result of smartphone usage.
Even though manufacturing is the most energy-intensive phase in the lifecycle of a technological product, the usage phase should not be underestimated.
Important progress has been made in energy efficiency, mostly through the technological and logistical optimization of data centers, the backbone of the IT sector. The International Energy Agency has a comprehensive report outlining how the energy consumption of data centers has remained stable despite increasing traffic [9]. However, trends such as growing network traffic and the growing volume of data that needs to be stored counteract the progress made. A report for the US Department of Energy concludes that efficiency strategies have theoretical and practical limits and that these limits might be reached earlier than expected [10].
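One way to see why lifespan matters so much when manufacturing dominates is to spread the lifetime energy over the years of use. The sketch below is a back-of-the-envelope illustration, not a lifecycle analysis: it takes the 94% figure above as given, assumes a baseline lifespan of 2.5 years (the midpoint of the 2 to 3 year range), assumes usage energy grows linearly with time, and ignores network and data center energy. All names are hypothetical.

```python
MANUFACTURING_SHARE = 0.94       # upper-bound share of lifetime energy cited above
BASELINE_LIFESPAN_YEARS = 2.5    # assumption: midpoint of the 2 to 3 year range


def energy_per_year(lifespan_years):
    """Energy per year of use, in units of one baseline lifetime footprint."""
    usage_share = 1.0 - MANUFACTURING_SHARE
    # Assumption: usage energy accumulates at a constant yearly rate.
    usage_per_year = usage_share / BASELINE_LIFESPAN_YEARS
    lifetime_total = MANUFACTURING_SHARE + usage_per_year * lifespan_years
    return lifetime_total / lifespan_years


baseline = energy_per_year(2.5)
extended = energy_per_year(5.0)
saving = 1 - extended / baseline

print(f"baseline: {baseline:.3f} per year")
print(f"extended: {extended:.3f} per year")
print(f"saving:   {saving:.0%}")
```

Under these assumptions, keeping the same phone twice as long cuts the energy attributable to each year of use by a little under half, because the manufacturing cost is paid only once.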
Regarding internet traffic, Cisco’s estimates [1] for the United States and Southeast Asia point to threefold and fourfold growth respectively by 2022 compared to 2017. This means reaching a volume of around 300 gigabytes per month for each US citizen: 500 million trees will be needed to compensate for the carbon footprint of the predicted internet consumption of the United States alone [2]. Additionally, as we highlighted in our article Back to nature: rewilding & businesses, the business of offsetting emissions through the purchase of carbon credits lacks standardization and traceability. All in all, a strategy of offsetting our future internet usage seems completely unrealistic.
The 2021 IPCC report states unequivocally that greenhouse gases (GHG) are primarily responsible for the climate change we are witnessing today. The digital sector has its share of responsibility: it contributed an estimated 3.5-3.8% of global GHG emissions in 2019 and has surpassed the civil aviation sector since 2018 [11, 12]. Even more worryingly, the GHG footprint seems to follow an exponential trend, with 6% growth per year and the risk of doubling the total impact by 2025 [11].
Contrary to common belief, digital has become a significant consumer of energy. We believe that innovation will therefore mean taking a holistic approach and evaluating the impact of digitalization through a full lifecycle analysis of products.
The energy consumption of digital infrastructure is a topic of high interest for Mantu and an area of significant focus for our dedicated team, which makes this knowledge available to our partners through Amaris’ Green IT offer.
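To make the exponential trend concrete, the following Python sketch compounds a steady 6% annual growth rate. It is only an arithmetic illustration: the 2019 starting year and the assumption of a constant rate are ours, and the function names are hypothetical.

```python
import math

ANNUAL_GROWTH = 0.06  # growth rate cited above, assumed constant here


def growth_factor(years, rate=ANNUAL_GROWTH):
    """Multiplier applied to the footprint after `years` of compound growth."""
    return (1 + rate) ** years


def doubling_time(rate=ANNUAL_GROWTH):
    """Number of years needed for the footprint to double at a constant rate."""
    return math.log(2) / math.log(1 + rate)


def rate_to_double_in(years):
    """Constant annual rate that would double the footprint in `years`."""
    return 2 ** (1 / years) - 1


factor_2019_to_2025 = growth_factor(2025 - 2019)
years_to_double = doubling_time()
rate_needed = rate_to_double_in(2025 - 2019)

print(f"growth 2019 to 2025 at 6%/yr: x{factor_2019_to_2025:.2f}")
print(f"doubling time at 6%/yr: {years_to_double:.1f} years")
print(f"rate needed to double in 6 years: {rate_needed:.1%}")
```

A steady 6% a year multiplies the footprint by about 1.4 in six years and doubles it in just under twelve, while doubling within six years would correspond to roughly 12% a year. The doubling risk mentioned above therefore describes a steeper trajectory than steady 6% growth.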
Myth 3: Free products everywhere
It’s a pretty safe bet that almost everybody using a computer or a smartphone uses some free applications. There is a well-known saying on the web: “if you’re not paying for it, you’re the product”. In her book “The Age of Surveillance Capitalism”, Shoshana Zuboff, professor emerita at Harvard Business School, puts it even more bluntly: we are not even the product, but rather “the objects from which raw materials are extracted and expropriated for prediction factories” [13].
The free business model relies mostly on advertising. Many popular websites embed third-party applications to which the data they collect is sent.
Moreover, ad targeting strategies are volume-oriented. For instance, a 2015 study of 58 online advertising campaigns found that 50% of them generated fewer than one purchase per one million visits. This conversion rate is so low that some experts in the field seriously question the feasibility of such a volume-oriented strategy. It is hard to estimate precisely the pollution this sector generates, but what can be stated is that it remains opaque to the end user.
Although free products have helped democratize technology, they encourage overconsumption and erode our sense of a service’s value. If users really found a product useful, they would be willing to pay for it. For creators who depend on this model to make a living, revenue is generated not by the real added value they offer, but by advertising.
Having free products everywhere has major effects on the sustainability of our society by being privacy-invasive, producing hidden pollution, and putting social pressure on creators. When it’s free, the Earth pays, and our society pays.
At Mantu, through our brand Resp3ct, we aim to review the concept of marketing. Our unique approach focuses equally on the impact marketing has on the triad Planet-People-Performance.
Myth 4: The magic of AI
Artificial Intelligence is often seen as the major technological evolution of the last decade. Although AI is prominent in our daily lives, its very definition is still debated [15], which should already raise some questions.
Machine Learning, probably the most promoted sub-area of AI, relies much more on data exploitation than traditional software development does. To put it briefly, any ML-based algorithm is trained to learn patterns from past data. An AI system also makes simplifying assumptions about the world in which it is deployed. Moreover, an AI-based system is dynamic, as it can evolve together with its underlying data.
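The idea of learning patterns from past data, and of a system that changes when its data changes, can be sketched with the simplest possible model: a straight line fitted by least squares. The toy data and the function name `fit_line` are invented for illustration, and real ML systems are far more complex.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": the model learns a pattern from past observations (toy data).
past_x = [1, 2, 3, 4]
past_y = [2, 4, 6, 8]
slope, intercept = fit_line(past_x, past_y)

# The built-in simplifying assumption: the future looks like a straight line.
prediction = slope * 5 + intercept

# New data arrives that no longer follows the old pattern; retraining
# changes the model, so the system evolves with its underlying data.
new_x = past_x + [5, 6]
new_y = past_y + [13, 16]
new_slope, new_intercept = fit_line(new_x, new_y)

print(f"learned from past data: slope={slope:.2f}, prediction for x=5: {prediction:.1f}")
print(f"after retraining:       slope={new_slope:.2f}")
```

The first model predicts 10 for x=5 while the value actually observed later is 13: the model is only as good as its assumption and its data.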
There is also a more cynical way of seeing AI. Kate Crawford, a principal researcher at Microsoft, states in her book [16] that “AI is neither artificial nor intelligent”. She argues that such a system is made from natural resources, human labor, infrastructures, data sets, predefined rules, classifications, and, last but not least, computationally intensive processing accessible only to a minority of companies. The complexity of the system therefore goes far beyond the technical particularities of some algorithms.
A recent study [17] examined the energy consumption of the training phase of widely used AI systems in Natural Language Processing (NLP). This phase is undoubtedly the most computationally intensive and thus has a significant environmental impact. For example, enhancing a popular NLP model called Transformer with an approach known as Neural Architecture Search emitted the carbon equivalent of 315 trans-American flights.
Nevertheless, this impact is only one piece of the puzzle when analyzing an AI system. Such systems often come with deep and complex social and societal implications. The most well-known example is probably Tay, the chatbot developed by Microsoft in 2016, which began behaving inappropriately within about 24 hours of interacting with users on Twitter. Since then, numerous AI systems with significant social implications have been created in areas like healthcare, recruitment, criminal justice, and banking. Fairness is a critical point that must be taken into consideration so that AI systems don’t become a vector for increased economic or social inequality.
AI has immense potential, and the field has not yet reached maturity. However, a full lifecycle analysis, which weighs the externalities a system imposes against the benefits it offers, needs to be taken into consideration when choosing such a system. Policy makers, developers, product owners, and regulators all need to contribute to building an adequate framework. In this context, sustainable AI is focused on addressing the whole sociotechnical system of AI: design, training, development, validation, re-tuning, implementation, and use of AI [18].
The Mantu Innovation Lab is focusing on these aspects, examining not only the use of AI and its applications but also its implications at a social, societal, and environmental level.
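The flight comparison is a simple ratio, sketched below in Python. The two input figures are taken to be those reported in the cited study’s summary table (estimated emissions in pounds of CO2 equivalent) and should be verified against it; the variable names are hypothetical.

```python
# Figures assumed from the cited study's summary table, in lbs of CO2 equivalent.
NAS_TRAINING_LBS_CO2E = 626_155   # Transformer trained with Neural Architecture Search
FLIGHT_LBS_CO2E = 1_984           # one passenger flying between New York and San Francisco

equivalent_flights = NAS_TRAINING_LBS_CO2E / FLIGHT_LBS_CO2E
print(f"about {equivalent_flights:.0f} passenger flights")
```

The ratio comes out at a little over 315, each flight being counted per passenger rather than per aircraft.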
The benefits of digital transformation to humanity are undeniable and we could hardly imagine our current society without them. However, digital is physical. It is not in the clouds and it is not an ephemeral tool exempt from social, economic and environmental consequences. At Mantu we think that further innovation in this sector will mean boldly facing the previously mentioned issues and finding solutions for a more inclusive, fair, and sustainable future.
Innovation Engineer at Mantu Innovation Lab & Responsible for AI & Sustainability
1 “Cisco Predicts More IP Traffic in the Next Five Years Than in the History of the Internet”, 2018, https://newsroom.cisco.com/press-release-content?type=webcontent&articleId=1955935
2 Gerry McGovern, “World Wide Waste”, Silver Beach Publishing, 2020
3 Active Archive Alliance, “Active Archive and the State of the Industry”, 2018, https://activearchive.com/wp-content/uploads/2018/04/Active-Archive-and-the-State-of-the-Industry-2018_Final.pdf
4 Lucidworks, https://lucidworks.com/darkdata/
6 The Shift Project, “Lean ICT: Towards digital sobriety”, 2019
9 IEA, “The Carbon Footprint of Streaming Video: Fact-Checking the Headlines”, 2020, https://www.iea.org/commentaries/the-carbon-footprint-of-streaming-video-fact-checking-the-headlines
10 Arman Shehabi et al, “United States Data Center Energy Usage Report”, 2016
11 The Shift Project, “Impact Environnemental du numérique: Tendances à 5 ans et Gouvernance de la 5G”, 2021
12 GreenIT.fr, “Empreinte environnementale du numérique mondial”, 2019
13 Shoshana Zuboff, “The Age of Surveillance Capitalism”, Profile Books Ltd, 2019
14 Trent Walton, “Third Parties and the Fate of the Web”, 2019, https://noti.st/trentwalton/nD7VaO#sN0BTEb
15 Azeem Azhar and Rumman Chowdhury on Exponential View podcast, min 5, 2021 https://open.spotify.com/episode/1UI549SnFfJ1efoZIggB8B?si=3f2e2dc7df6b4c3f
16 Kate Crawford, “The Atlas of AI”, Yale University Press, 2021
17 Emma Strubell et al, “Energy and Policy Considerations for Deep Learning in NLP”, 2019, https://arxiv.org/pdf/1906.02243.pdf
18 Aimee van Wynsberghe, “Sustainable AI: AI for sustainability and the sustainability of AI”, 2021, https://doi.org/10.1007/s43681-021-00043-6