Generative AI’s physical infrastructure, explained

What needs to happen in the physical world so Chat-GPT can summarize an article for you?

Sometimes, it is easy to forget that the technology we take for granted isn’t magical. The internet relies on a web of under-sea cables that make it possible for us to instantly see content from across the world and mobile phones require cell towers and data centers to stay “mobile”.

The same goes for generative AI models. While opening up Chat-GPT and asking it for a summary of a paper you can’t be bothered to read seems like a simple affair – a lot needs to happen before text appears on your screen. You may have heard that one Chat-GPT prompt equals a bottle of water. You may have also heard that that statistic is not true. Its hard to simplify something as complex as the generative AI supply chain – but that doesn’t mean it’s not worth trying.

This text will, hopefully, make it easier to understand how the infrastructure of generative AI works and why understanding it matters. First, we will look at how generative AI models like Chat-GPT work. Then, we will break down its underlying infrastructure – looking at computing power, information and data centers. Finally, we can explore why understanding the physical realities of modern technology matters.

How does generative AI work?

When you ask Chat-GPT to summarize an article, the model doesn’t follow the same steps you would have to. It doesn’t have to actually read the article or draft a list of main points before it writes a summary.

Instead, it combines the data it was trained on and your prompt to calculate what is the statistically most likely string of words that could answer your question.¹

The data it was trained on doesn’t serve as a library of references that it can consult. Chat-GPT doesn’t look up an answer for you in its database – this is what sets it apart from search engines like Google. Instead, it uses data it has access to as a basis for calculating probability – in other words, for guessing.² And while you can make (un)educated guesses easily, ChatGPT needs a hell of a lot to happen before it gives you an answer that might be correct.

In order for a large language model (in this case, Chat GPT) to work, it first needs to be trained. Training the model starts with an algorithm – an equation with undefined coefficients. The model is then given data to determine what coefficient values fit the equation best. Once it has been exposed to enough data, the model gets better at making consistent guesses – it figures out what is statistically the most likely answer.³ If the purpose of the model is to guess which picture shows a dog, it needs to be exposed to enough pictures of different breeds to “learn” what makes up a dog.

First calculations are bound to have inaccuracies and inconsistencies which need to be corrected. Following the previous example, the model might initially conclude that a dog has four legs – it then needs either direct input from the programmer, or more data, to learn that an amputee chihuahua is indeed still a dog. After enough corrections, the model “learns” to recognise patterns and understand context enough to generate a plausible answer.⁴

In reality, the model doesn’t “learn” – it calculates. And it calculates A LOT.

1. Compute Power (or, more famously, chips)

Calculations need computing power. Power is provided by chips. In the same way that I can’t run a proper video game on my 100 euro laptop without the risk of it catching on fire, AI cannot be trained on just any computer chips.

It is mostly trained on graphics processing units (GPUs) which makes them a good starting point for looking into the materials and infrastructure needed for computing.⁵

GPU’s used for AI are made of an array of materials, but a couple of them are worth singling out. The base for GPUs are silica gel and quartz, which are turned into silicon wafers. Alternatively, germanium might be used. The silicon then needs to be “doped” with an array of minerals to supercharge its conductivity.⁶ Copper clad laminates, cobalt, tin, tantalum and tungsten are also necessary for the construction of chips.⁷

Silicon is the second most common element found in the Earth’s crust, meaning supply isn’t really an issue.⁸ However, it’s only useful for the production of electronic devices when it is in its purest form, which requires significant energy and resource-intensive mining and refining. These processes lead to deforestation, soil erosion and disruption of water supply, as well as degradation in air quality.⁹

Germanium, on the other hand, is extremely rare. Germanium can only be extracted as a side-product of extraction of other elements (mainly zinc), and as such it is in significantly shorter supply compared to silicon. Importantly, germanium has a recycling rate of somewhere between 5 to 25 percent – a significant issue considering its short supply and importance in both the technology and defense sectors.¹⁰

Copper and cobalt, tightly linked in their production and both crucial for new technologies, are important for AI chips as well.¹¹ Around 70% of cobalt is mined in the Democratic Republic of Congo (DRC), where the mining expansion has been linked to forced evictions of the local populations, as well as environmental degradation.¹² Around 40 000 child workers are part of the cobalt production in DRC, and cobalt mining is an important source of funding for the ongoing conflict¹³, which has resulted in war crimes, including ethnic and sexual violence¹⁴.

Mining of tin, tantalum and tungsten is also highly problematic. Significant suppliers of those minerals are found in China, Rwanda, Democratic Republic of Congo and the Great Lakes region in Africa. Their mining tends to lack in regulation and safety oversights, and the world-wide demand enables armed groups to engage in highly lucrative trade.¹⁵ The profits made continue to sustain numerous conflicts across the world – many of which you can read about here and here.

These raw materials are not mined for AI chips alone, and their impact is linked to their importance for most of today’s advanced technology. However, GPU’s are an important piece of this demand due to a rapid increase in their production, and their short life span.¹⁶

2. Data sets

Second crucial component of AI training is data. While information might seem immaterial, a lot of labor and resources go into making it usable for the purposes of training AI.

Generative AI gets trained in three phases; First, it “learns” by itself, using data from the internet. However, random internet data is messy, and generative AI models have a hard time making sense of it. In order to be as accurate as possible, the models need to be tweaked in the process of “supervised learning”.¹⁷ During this step, the model compares how it labeled data against how an actual human would label the same dataset. Although the “supervised” part of this step invokes a direct contact between human labelers and the model, this is often far from the truth. In majority of the cases, this process relies on pre-labeled data sets – a hot commodity in the AI supply chain.¹⁸

Labeling data sets is essentially a more complex version of what you have to do when Google asks you to “select all images that contain a motorbike” when you try to log into your e-mail. An AI model that has only been trained through self-supervised learning might not be able to differentiate between an actual motorbike and a framed picture of a motorbike – it needs additional information. This additional information is provided via pre-labeled data sets.

Pre-labeled sets are bought and used by the AI companies, but they are sold by gig-work companies and produced largely by underpaid workers in the Global South.¹⁹

Companies like Taskup, Remotasks, and Amazon Mechanical Turk are platforms for crowd-sourcing tasks. Instead of directly hiring people through a contract – which would give workers a certain level of protection and stability – they allow companies to list tasks online and outsource them to on-demand human laborers. These companies have proven to be incredibly valuable for training of generative AI as they are able to cheaply produce vast amounts of human-labeled data, whenever AI companies need it.²⁰ The price for this convenience is paid by the workers themselves.

Data labelers are at a mercy of the platform, which dictates when they can get access to new tasks. This, for many, means having to be on alert 24/7, ready to drop everything (including sleep, food or bathroom breaks) whenever a new assignment pops up. The platforms reward those that can work at any time, for hours at a time. The reward is not financial compensation, but access to new tasks – in turn, one poorly timed bathroom break can significantly impact all future work opportunities of a data labeler. Another factor that can significantly impact a labelers ability to make a living is the fact that AI companies do not have a constant need for their services – work can be abundant one day and completely dry out the next.²¹

The data which requires labeling can include depictions of violence, rape, bestiality, murder and hateful speech – significantly impacting worker’s mental health. The work itself is redundant, while at the same time requiring a high level of concentration and alertness (since an error in labeling can result in lack of future work).²² Many labeling gigs require workers to go through unpaid task-specific training, only for it to result in minutes of paid work. When they do get to paid tasks, labelers don’t know what the data will be used for, nor which company it will get sold to.²³

Unsurprisingly, the work is also poorly compensated. Due to the flexible nature of platforms such as Remotasks, the industry can relocate on a whim – based on where labor is the cheapest.²⁴ Kenya was a hotspot for data labeling for a couple of years. However, when the workers started to unionize, Remotasks quickly moved its operations to Ghana – leaving many trained and experienced labelers without jobs overnight.²⁵

The industry preys on the weak. This includes moving operations to countries in crisis (which are more likely to have workers happy to work for any compensation, no matter how meager),²⁶ using child labor, and exploiting the ability to exit the market at any point, to circumvent any attempts at regulation.²⁷

The work has been equated to modern slavery – a comparison which is only reinforced by the fact that the founder of ScaleAI (a subsidiary of Remotasks) became the youngerst “self-made” bilionare in the world at age 24.²⁸

For something that claims it will replace human labor, generative AI sure does require a lot of it. AI enthusiasts argue that this labor is only temporary, and that soon AI will be able to train itself. While there has been some effort to decrease the need for human work, it is unlikely that generative AI will be able to operate without any “human in the loop”.²⁹

3. Data Centers

Everything discussed up to this point comes together in data centers. The chips are put together into servers, the datasets are stored in storage drives, and these supercomputers are connected and kept running by networks, cooling systems and security software.* The result are industrial buildings, scattered across the world – the backbone of the modern, tech-driven world.³⁰

Data centers make sure that all the information needed for you to access a certain service is always online and always running.

Although data centers are in wide use already, the regular ones are simply too slow and too small to train generative AI. This is why we’ve seen a surge of investment into AI data centers – ones that have to account for generative AI’s need for more compute, more energy and more memory.³¹

The extra computing power is ensured by switching from central processing units (CPUs) used in regular data centers, to GPUs.

Because GPUs are significantly more power hungry than CPUs, AI data centers require much more energy. While a rack of servers in a regular data center uses anywhere from 5 to 10 kilowatts at any moment, a single rack in an AI data center needs between 40 to 110 kilowatts to run. This means that at the high end of their power use, the rack at the standard data center equates to keeping 166 light bulbs turned on, while the one at the AI data center comes closer to 1800. How many racks are in a data center is impossible to know for certain, as companies don’t disclose that data for security reasons (but you can read a good estimation here).³² The difference is significant, and AI data centers have been an increasing burden on power grids – unsurprising, considering that this type of data center can consume as much electricity as 50,000 homes.³³

The issue is best demonstrated in the eagerness of data companies to solve it. Microsoft has signed a deal to purchase all power a currently shattered nuclear plant at Three Mile Island will produce once it reopens; Google is investing in small modular nuclear reactors which it could place near its data centers, with Meta following suit. Big Tech companies are also investing in alternative solutions like geothermal projects and fusion power plants.³⁴

All of these projects are focused on the future – on phasing out the use of fossil fuels. However, they do not take away from the fact that currently generative AI’s need for power is delaying the planned shut-downs of some coal power plants and raising prices of electricity for people living near the centers.³⁵ Additionally, even if all energy used for generative AI is clean, it is still a substantial potion of overall supply – raising the question of whether it would be better used somewhere else.

There have been some indications that models might become more efficient, lowering their need for power.³⁶ If the energy required could be lowered, it is questionable whether the overall demand would go down or whether companies would increase the scale of their operations.³⁷

Second important difference between regular data centers and AI ones is in their cooling systems. Since CPUs in regular data centers need less power, they also generate less heat. In turn, they can be cooled with air ventilation. On the other hand, GPUs heat up significantly more and require liquid cooling.³⁸ Training the GPT-3 language model (which is already outdated, and has been replaced) has evaporated around 700,000 liters of clean freshwater.³⁹

High water usage is particularly concerning in areas facing water scarcity – and those tend to be the areas most attractive for building AI data centers since dry, hot climates lower the risk of metal corrosion.⁴⁰

Spain is a good example. Amazon’s data centers in Aragon, a region facing water scarcity, are licensed to use around 755,720 cubic metres of water a year. The same amount of water would be sufficient to irrigate more than 200 hectares of corn – the region’s main crop. Amazon is planning on building three additional data centers in the region and has put in a request for permission to increase water consumption by 48%.⁴¹

Once again, the scale of the issue is apparent in the various attempts the leading companies are making to solve it. Some advances have been made in making cooling less water-reliant (mainly by introducing direct-to-cell cooling systems) which seem promising but have not yet been put to use.⁴²

However, even if that issue is solved in due time, water consumption doesn’t end with cooling. Nuclear energy, for example, is particularly water-intensive. Often, “solving” one environmental issue means creating another.⁴³

So, what about it? Why does any of this matter?

Sometimes, it is easy to forget that the technology we take for granted isn’t magical. It also isn’t inevitable.

With all the talk about the “AI revolution” and the insistent push by Big Tech to put generative AI into just about anything⁴⁴, it is necessary to remember that what makes generative AI work is human labor and natural resources.

The goal of this text isn’t to convince you to stop using Chat-GPT. The goal is to urge you to think twice when you’re told that AI will solve climate change, or take over everyone’s jobs, or any of the other grandiose claims that Big Tech likes to make and the media likes to parrot.

Generative AI, no matter how much Sam Altman* would like you to believe it, does not stand on its own. It comes at a high environmental cost and it works thanks to millions of people mining raw materials, assembling hardware, labelling data and only then, developing software. It isn’t free and it isn’t immaterial.

This doesn’t mean there is no use for it. It means that we should know what the price of it is before we accept its application.

When generative AI is incorporated into security, energy, education, the workplace or the government, it becomes a part of infrastructure and threatens to be “locked in”. Carbon lock-in is a term used to explain how once dependencies on carbon-intensive infrastructures are made, they are almost impossible to break because too much of society comes to rely on them.⁴⁵ Likewise, pushing AI adoption in all sectors of society will result in dependencies and interdependencies from which it will be hard to escape.

So, since backtracking is almost impossible, asking “Is this necessary?” matters. And knowing what we are giving up in exchange is the first step in answering it.

Notes:

*here is an example of how it all comes together

*The CEO of OpenAI, which created ChatGPT

OTHER:

Get your own zine!

Do you have feedback, questions or comments?

Sources

Zmijica Zines