Artificially Intelligent: The Reality of AI's Limitations in Data Analysis

20.07.23 01:54 PM

Digital HUB is an open online community of financial and data science professionals pursuing practical applications of AI in their everyday functions. The Digital HUB community provides expert, curated insights into financial applications of Generative AI, Large Language Models, Machine Learning, Data Science, Crypto Assets, and Blockchain.

A key focus for The Digital HUB publication is to provide best practices for the safe deployment of AI at scale, such as assessing the ability to execute, determining an organization's digital DNA, fostering skill development, and encouraging responsible AI.

by Lucas Bunting

OpenAI’s newest model, GPT-4, promises a utopian future for data scientists where extracting insight from data is as simple as asking the right question. And judging by the speed at which these models are improving, I expect that day will come sooner than most realize. Unfortunately, GPT-4 is not a panacea for data analysis in its current state. While it is particularly astute at generating code to answer a question, it may also confidently provide you with an incorrect numeric answer to the same question. The only remedy is to check its work and verify the response, and at that point it may have been quicker not to use the tool at all.

The point of this article is not to dissuade anyone from using ChatGPT for data analysis; quite the opposite. The goal is to create some dialogue in this community around the practical use cases and limitations of these AI models today, and to encourage hands-on experience with AI technology in order to become a more efficient analyst who works smarter, not harder.

To test the current state of GPT-4’s usefulness in data analysis, I provided the AI with a small data set of monthly log returns for the S&P 500, 400, and 600 over the last three years. I then gave it a series of tasks common in the data scientist’s workflow, including cleaning data, generating descriptive statistics, and creating a forecast of one of the variables. If the AI could perform these tasks well, it would significantly ease the burden of conducting extensive exploratory data analysis.

A logical first step would be to clean the data set. To test ChatGPT, I made one observation NA and one an outlier (900). Let’s see if it can catch them. (Note: data sets can be pasted into the chat box, and the AI can infer the rows and columns.)

Prompt: Here is a data set of stock index log returns. Can you clean this data for me?

date ^GSPC ^SP400 ^SP600
2020-01-31 -0.001629419602 -0.02742027258 -0.04132477276
(…)

Response:

This is quite impressive. It identified the date field and suggested turning it into a datetime type, it recommended a way to fix the NA, and it found the extreme value (900). What’s impressive (but impractical) is that it decided to return an example of the cleaned data set written out as a table like the one above. It would not be a stretch to expect the model to return a .csv with the cleaned data in the future, but for now, instructions on how to clean will be sufficient.
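For readers who would rather reproduce this step locally than trust the chat output, here is a minimal pandas sketch of the cleaning the model described. The file name is assumed for illustration, and median-filling is just one reasonable way to handle the bad observations, not the only one.

```python
import pandas as pd

# Load the monthly log-return data (file name assumed for illustration).
df = pd.read_csv("index_log_returns.csv", parse_dates=["date"])

cols = ["^GSPC", "^SP400", "^SP600"]

# Monthly log returns should be small in magnitude, so a value like 900
# is clearly a data error; mask anything implausible as NA.
df[cols] = df[cols].where(df[cols].abs() < 1.0)

# Fill the masked value and the original NA with each column's median;
# dropping the rows or interpolating would be equally defensible choices.
df[cols] = df[cols].fillna(df[cols].median())
```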
A natural next step would be to calculate some descriptive statistics. I’m going to load in a clean data set and ask it to do that for me.

Prompt: Here is a new data set of stock index returns. Please provide me with some descriptive statistics and a correlation matrix.

date ^GSPC ^SP400 ^SP600
2020-01-31 -0.001629419602 -0.02742027258 -0.04132477276
(…)

Response:

Here we have our first example of the AI confidently giving the wrong answer. It returned the correct Min and Max for all but the Min of ^SP600. The Mean and Standard Deviation, however, were wrong. And, while directionally correct, it also managed to get the correlation matrix wrong.

Here are the expected values:

You can undoubtedly see the potential, despite these answers not being correct. The model gets close, and future iterations of these models may include features that can execute better mathematically, but for now, the best practice is to ask it for code that generates an answer to the question instead of asking the AI directly for the answer.
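That best practice is easy to follow here: the same request, answered with a few lines of pandas run locally, produces statistics you can actually trust. This sketch assumes the cleaned data from the previous step, saved under an illustrative file name.

```python
import pandas as pd

# Load the cleaned monthly log returns (file name assumed for illustration).
df = pd.read_csv("index_log_returns_clean.csv", parse_dates=["date"])
cols = ["^GSPC", "^SP400", "^SP600"]

# Descriptive statistics (count, mean, std, min, quartiles, max),
# computed locally so the numbers can be verified.
print(df[cols].describe())

# Pairwise Pearson correlations between the three indices.
print(df[cols].corr())
```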
Although we’ve seen instability with easier tasks, let’s throw a modeling question at it. The data set we’re using is a time series, so let’s see if it can use an AR(1) model to predict the next value of the S&P 500.

Prompt: Using the data set I gave you, can you fit an AR(1) model and give me the prediction for the next value of ^GSPC?

Response:

It’s important to note that this is generally good code, but it will throw an error. disp is not an argument of model.fit(), and statsmodels has updated its packages so that the proper import would be from statsmodels.tsa.arima.model import ARIMA. Once those two issues are rectified, the code generates a January 31, 2023 prediction of 0.012863.

If we then ask it to compute the forecast mathematically, it walks through the steps (which is neat!) and even explains the assumptions. It references the data set correctly and shows its work. The way it estimates alpha and phi, however, is different from the Python implementation it generated, thus resulting in an incorrect forecast.

Incorrect numerical answer aside, the information it provides is quite useful. It shows how an AR(1) model is formulated, how the parameters are estimated, and provides an example of how the math would be done, should you have proper parameter estimates. The steps are right; it just confidently executes them incorrectly!
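For reference, here is a corrected version of the kind of code the model produced, with the updated import and without the invalid disp argument, plus a by-hand check of the one-step forecast. The file name is assumed for illustration; an AR(1) is expressed in statsmodels as an ARIMA(1, 0, 0), whose constant is parameterized as the process mean.

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Load the cleaned monthly log returns (file name assumed for illustration).
df = pd.read_csv("index_log_returns_clean.csv",
                 parse_dates=["date"], index_col="date")

# An AR(1) is an ARIMA(1, 0, 0): r_t = mu + phi * (r_{t-1} - mu) + e_t.
model = ARIMA(df["^GSPC"], order=(1, 0, 0))
fitted = model.fit()  # note: fit() takes no `disp` argument in current statsmodels

# One-step-ahead forecast of the next monthly log return.
print(fitted.forecast(steps=1))

# The same forecast by hand, using the fitted parameters:
# in this parameterization, "const" is the estimated mean mu.
mu = fitted.params["const"]
phi = fitted.params["ar.L1"]
r_hat = mu + phi * (df["^GSPC"].iloc[-1] - mu)
print(r_hat)
```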
It is truly impressive how far OpenAI’s ChatGPT has come in such a short time. The possibilities with this technology cannot easily be overstated. It is a natural symptom of the hype cycle to get lost in all the promises implicitly made by new technology, but as people of science and data, we must ensure we approach AI in an intellectually honest way.

These tools can be useful in several parts of the EDA process, as exemplified in this article. ChatGPT is particularly good at suggesting paths forward in EDA, generating relevant code, and explaining mathematical methods and their accompanying assumptions in plain English. The responses GPT-4 generates for mathematical or statistical data analysis, however, are fallible and require significant human oversight. Left unchecked, these issues can create adverse downstream effects. The consequences can be as harmless as a ~0.05 error in the correlation matrix of a personal project, or immensely consequential, like a judge using it for a ruling, thus dictating the fate of the defendant.(1)

As we see AI models get integrated into our systems and products, the real winners, I propose, will be the humans with a curiosity for the technology, an uncommon amount of common sense, and a healthy skepticism of results, who can take advantage of the tool without becoming dependent on it.

(1) https://www.cbsnews.com/news/colombian-judge-uses-chatgpt-in-ruling-on-childs-medical-rights-case/