ChatGPT from OpenAI, Gemini from Google, Llama from Meta, Claude from Anthropic: Large Language Models (LLMs) are currently a much-discussed subject.
The range of new services that these models provide is overwhelming.
If you’ve ever wondered why ChatGPT 3 communicates with you so politely and courteously, it is because, in the second training step, experts from various disciplines answered hundreds of thousands of real questions, and the Large Language Model was then trained on these answers.
LLMs are always based on two training steps. In the first step, ChatGPT 3 processed 320 terabytes of publicly accessible documents. The texts were divided into so-called tokens, which roughly correspond to a word. Each token is then assigned around 20,000 attributes. Calculating these attributes is the main computing task in the first step; to do this, the world’s largest computers spend several weeks working through the 320 terabytes of data. This gives the model the ability to string individual words together like a human being and to form an answer depending on the words it is given. The system then uses large tables to calculate which word is most likely to come next.
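To make the idea of tokens, attributes and next-word probabilities a little more concrete, here is a deliberately tiny Python sketch. The mini-corpus, the four attributes per token and the simple next-word table are all invented for illustration; real systems work with terabytes of text, thousands of attributes per token and far more sophisticated models.

```python
# Toy sketch of the first training step's ingredients:
# tokens, per-token attributes, and a "which word comes next?" table.
from collections import Counter, defaultdict
import random

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# 1) Tokenise: here a token is simply a whitespace-separated word.
vocab = sorted(set(corpus))

# 2) Assign each token a small vector of attributes (real models use
#    thousands; four made-up random values stand in for them here).
random.seed(0)
embeddings = {tok: [random.uniform(-1, 1) for _ in range(4)] for tok in vocab}

# 3) Build a "large table" of next-word counts (a bigram model), the
#    simplest possible version of next-word probability.
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def most_likely_next(word: str) -> str:
    """Return the most frequent follower of `word` in the toy corpus."""
    return next_word_counts[word].most_common(1)[0][0]

print(most_likely_next("the"))   # e.g. 'cat'
print(embeddings["cat"])         # the four made-up attributes for 'cat'
```

Real LLMs replace the bigram table with a neural network, but the principle is the same: given the words so far, pick the most plausible next word.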
In the second training step, the LLM is then trained with specialized knowledge that covers specific areas, or to answer all kinds of questions like a friendly assistant.
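As a rough sketch of how this second step can be organised: the expert-written question-and-answer pairs are turned into training texts that the pre-trained model is further trained ("fine-tuned") on. The example pairs and the prompt format below are invented for illustration; real pipelines tokenise these texts and run additional training with frameworks such as PyTorch.

```python
# Minimal sketch: turning expert Q&A pairs into fine-tuning examples.
expert_answers = [
    {"question": "What is a token?",
     "answer": "A token is a small unit of text, roughly one word."},
    {"question": "Why are two training steps used?",
     "answer": "The first step learns language in general; the second teaches "
               "the model to answer questions helpfully."},
]

def to_training_example(pair: dict) -> str:
    # One common pattern: concatenate question and answer into a single
    # text that the model learns to continue.
    return f"User: {pair['question']}\nAssistant: {pair['answer']}"

training_texts = [to_training_example(p) for p in expert_answers]

for text in training_texts:
    print(text, end="\n\n")
# In a real pipeline these texts would now be tokenised and used for
# further gradient updates on the pre-trained model.
```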
OpenAI (ChatGPT) is facing a deficit of around 7 billion US dollars this year, but anyone who saw this year’s Microsoft keynote knows that Microsoft wants to bring AI into all areas of people’s lives and earn gigantic sums of money with it.
For those who will use AI, from ordinary people to larger organizations or universities, there are a few essential problems:
1. We have no control over what the system is trained on and with what emphasis. The regional or cultural focus is determined by the companies that build the system. (For example, in ChatGPT 3, content from the Reddit platform was weighted 23 times more heavily because it was classified as particularly relevant; the sketch after this list illustrates what such over-weighting does to the training mix.)
2. In the future, a gigantic increase in the basic data used for the first training step and in the number of attributes is expected. (From the previous public version to ChatGPT 3, the number of attributes grew from approx. 760 to over 19,000.) As a consequence, this means a gigantic increase in the required computing power, which nobody except the few digital giants can afford.
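The following toy sketch shows what over-weighting a data source means in practice: if one source is given a higher sampling weight, its documents appear proportionally more often in the training mix. The source names, document counts and the factor 23 mirror the example above but are otherwise invented.

```python
# Toy sketch: how a sampling weight changes a training data mix.
weights = {
    "web_crawl": 1.0,
    "books": 1.0,
    "reddit": 23.0,   # content judged "particularly relevant" gets a boost
}
doc_counts = {"web_crawl": 1_000_000, "books": 200_000, "reddit": 50_000}

# Effective share of each source in the training mix after weighting.
effective = {src: weights[src] * doc_counts[src] for src in weights}
total = sum(effective.values())
for src, value in effective.items():
    print(f"{src:10s} {value / total:.1%} of the training mix")
# Even a relatively small source can dominate the mix once it is
# over-weighted, and only the provider knows how these weights were chosen.
```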
If we don’t want to become the paying slaves of huge corporations, we have to build our own systems: as countries, as economic units (e.g. the EU or the African Union), as associations of universities, or as global associations of scientists and non-profit organizations.
This is why the open-source idea for digital systems, i.e. the open sharing of software and digital tools where everyone can see exactly what each individual line of a program looks like, is essential for the future.
On July 6th there will be an online event here on DIGI-FACE that will focus on exactly this topic.
Register and take part!
It will take 2-3 hours and is FREE of charge!