Does AI only repeat what it has learned?
Artificial intelligence is often criticized with claims that it can only repeat its training data, and therefore always produces plagiarized and average output. Is there any truth to these claims?

Claim 1: AI retrieves its answers from a database

I’ve encountered this claim often. The idea is that AI retrieves answers from a database, and thus it either plagiarizes or fails to find the correct answer. Large language models and image-generating AI models do not, by default, have access to any kind of database. Instead, these models have learned to generate responses independently. The image or poem produced by AI, for example, does not exist as-is in any database.

Large language models don’t use databases, but they can be connected to one

Today, large language models can indeed be connected to a database. Currently, the most common method for doing this is retrieval-augmented generation (RAG). In this setup, the AI retrieves information from a database to support its answer. However, the AI still writes the response itself.

Claim 2: AI only produces average answers

This claim is more complex, as there are many types of generative AI models. Images are often produced using diffusion models, which begin with a random mess of pixels and gradually transform that noise into a better image. The AI aims to reach some sort of optimal average output, so its tendency is toward the mean. Diffusion models run iteratively: each iteration creates a better image, one that’s also closer to the average. Somewhere between the initial noise and the optimal average lies an iteration where the AI produces good images that haven’t yet converged into uniform, average-looking ones. These images are by no means simply average, even if they inevitably share something with the optimal average.

With an update, Adobe Firefly began producing better, though very similar, images

What about large language models?
They also aim to produce the best possible answer, which, depending on the prompt, often results in an average-like response. However, large language models have a parameter called temperature, which influences how average or creative the responses are. At the extremes, adjusting the temperature can make the model generate either extremely bland text or pure nonsensical gibberish.

Emergent intelligence

The intelligence of large language models is emergent: they can generalize from what they’ve learned to completely new tasks. This means that AI models can generate responses to questions they’ve never encountered in their training data. These responses are not merely average repetitions of what’s already been learned, as the AI does not simply mimic its data like a parrot would.

Adobe Firefly’s training data guides it so heavily that it cannot generate a wine glass filled to the brim

Image-generating AI models do not show the same level of emergent intelligence, as their training data influences their output more heavily than it does with text models. It can often be nearly impossible to get certain kinds of images out of them.

Average or not?

The claim that AI only produces average responses is an oversimplification. Training data influences AI to a varying degree depending on the model, but that doesn’t mean AI is only capable of producing dull, obvious answers. Nor does AI merely repeat what it has learned, since it’s trained to provide responses to problems it has never encountered before.
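The bland-to-gibberish spectrum that temperature controls can be illustrated with a small sketch of temperature-scaled sampling. This is a simplified illustration, not any model’s actual implementation, and the token scores are made up.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw token scores into probabilities, scaled by temperature.

    Low temperature sharpens the distribution (bland, predictable picks);
    high temperature flattens it (creative, eventually incoherent picks).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for four candidate next tokens.
logits = [4.0, 3.0, 2.0, 1.0]

cold = softmax_with_temperature(logits, 0.1)   # near-deterministic
hot = softmax_with_temperature(logits, 10.0)   # near-uniform
```

At temperature 0.1 virtually all probability mass falls on the highest-scoring token, while at temperature 10 the four tokens become almost equally likely, mirroring the extremes described above.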
Does artificial intelligence only look into the past?
Lately, an interesting argument has come to my attention: ChatGPT only looks into the past, whereas humans can look into the future. This idea stems from the fact that AI is trained on past data; for instance, ChatGPT's knowledge of the world is limited to the cutoff date of its training material. However, this does not mean that AI only looks at the past.

Machine Learning Always Faces the Unknown

The fundamental principle of machine learning has always been to train AI on past data and test it on new, unseen data. This ensures that the AI functions as expected even when it encounters entirely new information.

Machine learning aims to work with new data

Before large language models, language-technology-based machine learning models often struggled when faced with completely new types of data. For example, an AI trained on product reviews did not perform well at identifying positive and negative expressions in literary texts. Large language models have overcome these limitations, as they can generalize their learning to perform many different tasks.

Do Humans Really Look into the Future Any Better?

When we humans encounter something new, we often rely on past knowledge to act. Our own "training data" also ends at the present moment. If we see an unfamiliar furry creature walking towards us on a leash, we logically assume it is a dog. This assumption is based on previous knowledge. If it turns out to be a completely new and unknown animal species, we are surprised by the encounter. Similarly, AI relies on existing knowledge when encountering new things. The difference is that, at present, we do not have AI tools capable of dynamically learning from their experiences and updating themselves. The AI will keep assuming that the furry creature is a dog until its training data includes information that a new pet-friendly species has been discovered. A human, however, would learn this instantly.
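The train-on-the-past, test-on-the-unseen principle, and the dog-on-a-leash assumption, can be sketched with a toy classifier. The data and the nearest-neighbour rule here are invented for illustration; real systems use far richer features and models.

```python
def nearest_neighbor_predict(train, new_point):
    """Classify a new, unseen point by the label of its closest training point.

    The model has only ever 'seen' past data, yet it produces an answer for
    input it has never encountered -- the core promise of machine learning.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min((dist(point, new_point), lbl) for point, lbl in train)
    return label

# Past "training data": (weight in kg, height in cm) -> species label.
past_observations = [
    ((30.0, 60.0), "dog"),
    ((4.0, 25.0), "cat"),
    ((25.0, 55.0), "dog"),
    ((5.0, 23.0), "cat"),
]

# An animal the model has never seen before: like us meeting the furry
# creature on the leash, it falls back on the closest known category.
print(nearest_neighbor_predict(past_observations, (28.0, 58.0)))  # prints "dog"
```

Until the training data is updated with a new species, every dog-sized furry creature will be classified as a dog, exactly as described above.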
Foreseeing the Future is Reasoning

Just as humans predict the future using reasoning and scenario planning, AI can also predict the future by drawing logical conclusions. Large language models are already capable of reasoning and performing tasks that require thought. AI can therefore look into the future if it is properly guided with prompts to make predictions. Many AI tools, such as ChatGPT and Perplexity, can also fetch additional information from the web, allowing them to base their reasoning on up-to-date data.
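Guiding a model toward predictions, as described above, largely comes down to prompt construction. The template below is a hypothetical example of my own, not a prompt taken from ChatGPT, Perplexity, or any other tool.

```python
def build_forecast_prompt(topic, horizon_years, known_facts):
    """Assemble a prompt that steers an LLM toward reasoned forecasting
    (scenarios and stated assumptions) rather than a recap of past data."""
    facts = "\n".join(f"- {fact}" for fact in known_facts)
    return (
        f"You are forecasting developments in {topic} over the next "
        f"{horizon_years} years.\n"
        f"Known facts:\n{facts}\n"
        "Reason step by step from these facts, outline two or three "
        "plausible scenarios, and state the assumptions behind each."
    )

prompt = build_forecast_prompt(
    "consumer AI tools",
    3,
    ["LLMs can browse the web for current data",
     "Adoption is growing rapidly"],
)
print(prompt)
```

The point of such a prompt is to make the model reason forward from known facts, rather than merely summarize them.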
Can AI Be Used to Forecast Change with MLPESTEL?
Dr Khalid Alnajjar and Dr Mika Hämäläinen explored in their MBA thesis the capability of artificial intelligence (AI) to forecast change in the operational environment of companies. For this task, they employed a large language model (LLM) and developed a new theoretical framework called MLPESTEL.

The Paradigm Shift that Made Forecasting Possible

Traditionally, machine learning (ML) techniques have relied on learning patterns from data for individual tasks. Such models have therefore been able to formulate predictions only in very limited application areas, such as weather forecasting or financial forecasting. However, the dawn of LLMs made it possible for AI to conduct reasoning outside of narrow domains and on textual data instead of numerical data.

A Call for a New Framework

Although LLMs such as ChatGPT have remarkable capabilities in terms of reasoning and answering a variety of prompts, they cannot tackle a problem as difficult as forecasting change with a mere prompt. LLMs can reason, but they need to be given the tools to do so - just like us humans. Furthermore, such a complex task must be split into smaller subproblems.

The MLPESTEL framework by Alnajjar & Hämäläinen (2024)

The researchers developed a new framework called MLPESTEL, which draws its inspiration from PESTEL, a framework traditionally used in business research, and from Ecological Systems Theory, a framework commonly used to understand the social development of a child. The former is important for the business application area of the research, whereas the latter was used to divide each individual PESTEL category into four subsystems: micro, meso, exo and macro. The resulting framework would be quite complex for a person to conduct analysis with, but not at all too demanding for an LLM, which can easily operate at such a level of complexity.
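The structure of the framework, six PESTEL categories each split into four subsystems, can be sketched as a simple analysis grid. The category and subsystem names follow the text above, but the task wording is my own illustrative assumption, not a prompt from the thesis.

```python
PESTEL_CATEGORIES = [
    "Political", "Economic", "Social",
    "Technological", "Environmental", "Legal",
]
SUBSYSTEMS = ["micro", "meso", "exo", "macro"]

def build_analysis_grid(company):
    """Enumerate every category/subsystem cell an LLM would analyze:
    6 x 4 = 24 smaller subproblems instead of one huge forecasting prompt."""
    return [
        {
            "company": company,
            "category": category,
            "subsystem": subsystem,
            "task": (
                f"Analyze the {subsystem}-level {category.lower()} factors "
                f"affecting {company} and forecast how they may change."
            ),
        }
        for category in PESTEL_CATEGORIES
        for subsystem in SUBSYSTEMS
    ]

grid = build_analysis_grid("Nokia")
print(len(grid))  # prints 24
```

Splitting the analysis into cells like these reflects the point above: the complex task is divided into subproblems small enough for an LLM to reason about one at a time.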
Early Results on AI-based Forecasting

The researchers investigated the viability of their method by studying the predictive capabilities of an LLM using the MLPESTEL framework on two international companies: Nokia and Tesla. The method correctly predicted the opportunity that 5G technology brought to Nokia and the difficulties that the global chip shortage caused for Tesla. The results obtained in the thesis work are promising and serve as a proof of concept: LLMs have reached a maturity level at which they can be used for forecasting tasks, and MLPESTEL has extended the theoretical capability of conducting forecasting in the context of the operational business environment. This research has paved the road for future studies on LLM-driven forecasting and futures studies. The findings serve as a stepping stone for a more comprehensive platform to be developed at Metropolia University of Applied Sciences.