Saturday, February 18, 2023

chatGPT - my chat with chatGPT and more

I heard about Dall-e and chatGPT in late 2022. Both are based on the GPT AI architecture; GPT stands for generative pretrained transformer. I tried Dall-e-2 a couple of times*, but I did not probe either of them further.

By the end of January 2023, there was a lot of chatter on social media about chatGPT. Some friends and professional colleagues started to share their experiences with chatGPT on social media. I was intrigued by the various claims - some neat capabilities of chatGPT, some blunders; believers hyped it, doubters dismissed it. I decided to try it myself.

Chat with chatGPT about chatGPT

I had a lengthy chat with chatGPT, completed in several sittings.

Basics of chatGPT

chatGPT was developed by OpenAI, which was founded in 2015 as a nonprofit organization. It spun off its commercial business as OpenAI LP in 2019. OpenAI is also the developer of the GPT models, including GPT-3.

The GPT model is based on the transformer architecture, which was introduced in a 2017 paper by Vaswani et al. (Google Brain).

Model size: OpenAI's GPT-3 (Generative Pretrained Transformer 3), which is the model that chatGPT was built on, has approximately 175 billion parameters.
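To get a feel for what 175 billion parameters means, here is a rough back-of-the-envelope sketch of the storage needed for the weights alone. The bytes-per-parameter values (4 for float32, 2 for float16) are my assumptions for illustration; I do not know which precision OpenAI actually uses, and the function name is made up for this example.

```python
# Rough storage needed just to hold the weights of a 175-billion-parameter model.
# The precisions below are illustrative assumptions, not disclosed OpenAI details.
PARAMETERS = 175_000_000_000

BYTES_PER_PARAMETER = {
    "float32": 4,  # 32-bit floating point
    "float16": 2,  # 16-bit floating point
}


def weight_storage_gb(parameters: int, bytes_per_parameter: int) -> float:
    """Return the weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return parameters * bytes_per_parameter / 10**9


for precision, size in BYTES_PER_PARAMETER.items():
    print(f"{precision}: about {weight_storage_gb(PARAMETERS, size):,.0f} GB")
```

Under these assumptions the weights alone come to hundreds of gigabytes, far more than a typical personal computer can hold in memory.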

Training data: GPT-3 was trained on a massive amount of text data from the internet, including web pages, books, and more. The exact amount of data used for training is not disclosed by OpenAI, but it is estimated to be in the range of hundreds of billions or even trillions of words. The training process involved fine-tuning the model on this data to generate high-quality text output.

Training time: The exact duration of the training process for GPT-3 is not publicly disclosed by OpenAI. However, training a large-scale language model like GPT-3 can take several months to a year, depending on the amount of data used and the computational resources available. It is a computationally intensive process that requires large amounts of computing power and memory.

Knowledge cut-off: The knowledge cutoff of 2021 for GPT-3 (and for chatGPT as a language model based on GPT-3) is not directly related to the time it took to train the model. Rather, it is because the training data that was used to fine-tune the model only included information up until 2021. As a result, the model may not be up-to-date on events or developments that have occurred after that time. This is why it's important to note the knowledge cutoff when using a language model like GPT-3 or chatGPT for information or content generation.

Here are some more interesting Q/A:

Allan: back to the previous training question, 175 billion parameters for the model, that sounds like a brute force approach to train you.

Allan: what I observed so far about people using chatGPT to solve their problems...the problems have answers in literature already. is this accurate?

Allan: can you generate new knowledge?

Allan: can you learn from your chats with users in real time or in a short time?**

Allan: it is interesting to talk to you about you. here is a philosophical question. We know that with GPS/Google Maps many people's ability to remember driving routes has deteriorated a lot. Quite a few people cannot drive without GPS guidance. Can the same happen with chatGPT applications in some way, for example, chatGPT will make humans more efficient and retarded at the same time?

Limitations of chatGPT

Allan: besides the three limitations outlined on the front page of chatGPT, any other things I need to pay attention to when using chatGPT?


Allan: why is it so hard for chatGPT to have common sense reasoning?

What can chatGPT do?

answer questions with a high-level summary, no references provided (tried)
do limited literature search (tried)
write poems/essays (tried)
write code in many languages (tried Fortran, Visual Basic)
debug code (not tried)
write music (not tried)
create/tell a joke (tried)
analyze data (not tried)
manage work schedule (not tried; I wonder why people need AI to schedule an individual's work; for a large organization, maybe)
therapist (not tried)
multilingual (tried - English, Chinese)
extract data from text (not tried, sounds very useful!)
with API, it can do a lot more
Google search companion
write macros or Visual Basic code for Excel spreadsheets ...

If you want chatGPT to do something for you and you are not sure whether it can, just ask it: whether it can and, if so, how.

Some technical stuff about chatGPT

Allan: I am not familiar with AI technology. could you give a few examples of AI architectures?

Allan: what are the main differences between the transformer architecture and other AI architectures?

Allan: chatGPT was trained on a large corpus of text. if the corpus of text was the input, what was the output used in the training?
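For GPT-style language models, the usual answer is that the "output" is simply the next word (token) of the same text: the model sees the text so far and is trained to predict what comes next. The toy sketch below shows how one sentence turns into input/target pairs. It is only an illustration under simplifying assumptions: the whitespace tokenizer, the context size of 3, and the function name are mine, and real models use subword tokens and much longer contexts.

```python
def next_token_pairs(text: str, context_size: int = 3):
    """Build (context, target) training pairs for next-token prediction.

    Toy assumptions: words split on whitespace stand in for tokens,
    and the context is limited to the last `context_size` words.
    """
    tokens = text.split()
    pairs = []
    for i in range(1, len(tokens)):
        # Input: up to `context_size` tokens that come before position i.
        context = tokens[max(0, i - context_size):i]
        # Target: the token at position i, taken from the same text.
        pairs.append((context, tokens[i]))
    return pairs


for context, target in next_token_pairs("the cat sat on the mat"):
    print(" ".join(context), "->", target)
```

No separate answer key is needed in this setup, because the corpus itself supplies both the input and the expected output.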


A somewhat more in-depth explanation of how chatGPT works can be found in this article.

Snap reactions

chatGPT is a good tool, very conversational.
Know its limitations and deficiencies - don't blindly trust it.
Use chatGPT and AI to your advantage - enhance your real, natural intelligence.


* What I asked Dall-e-2 to do was to draw a bridge to the Moon. I did not get what I expected. So I asked it to draw a bridge under the Moon, and everything looked good this time. I then changed the prompt to a bridge that connects Earth to the Moon. It seemed that Dall-e-2 was trained for sci-fi types of drawings.

** Sometimes it may seem that chatGPT learns from the conversation. That is because it frequently recaps a conversation, especially if it is a long one.
