Having a discussion with AI and the importance of direct experience
ChatGPT is a good but naïve research assistant. It can search for all sorts of things and catch the essence of huge amounts of material. But there are key things you know that it doesn’t. So there are things you need to check and ask it to redo. Knowing where and when to do this is crucial.
In a recent publication I interrogate the naïve and not immediately apparent ideological underpinnings of an ethnographic description it produced for one of my students.
When I saw an invitation from it to produce images, I was concerned about the current proliferation of AI-generated images, so I decided to see what it could do. I asked it to make a picture of an Iranian village with a mosque in a beautiful valley landscape. I asked this because I already have one, which I’ve used for my website background image – at the top of this page.
My picture comes from Beny & Nasr’s 1975 Persia: bridge of turquoise. I haven’t been there; but it resonates inspiringly with places in Iran I have been to. Also, ChatGPT was able to tell me the geographic location accurately – at least as described in the book.
However, what it quickly produced was this:

I can see it’s AI-generated because it resembles other images that appear these days in Facebook reels. I don’t have the literacy to describe the style; but I can recognise it.
When I asked ChatGPT where it got the image from, it told me it was:
an illustrative, AI-created scene, inspired by typical Iranian landscapes and architecture, not a photograph of an actual village or mosque.
But when I accused it of producing something that was ‘quite dangerous because it gives people who’ve never been to such a location a very superficial impression’, it came back with an interesting critique – using the phrases:
Epistemically dangerous … flatten reality into a cliché … reinforce exoticism … a subtle form of othering that freezes places outside the West in an eternal postcard state.
It then offered to
deconstruct this image point by point against … design prompts that resist exoticisation, or discuss when AI imagery should not be used at all.
Despite realising that its apparently impressive list of critical phrases was also an algorithmic response to my question, I asked why it therefore created such images when it knows all the limitations. It replied with a detailed explanation (its emphasis):
I generate images …, even though I ‘know’ the limitations, because the system prioritises usefulness and accessibility over epistemic caution. … The model can’t refuse on philosophical grounds. … Misrepresentation-by-simplification isn’t currently treated as a hard safety boundary. That said, … the system should probably surface epistemic warnings more often, not just produce outputs …. Right now, my default behaviour is: Comply first, Explain later if challenged.
When I said that I’d expected it to search actual images to find something more authentic, it replied that this could be done, but not with the current interface. But it then suggested how we could work together in the future using the following principles (its emphasis):
Work only with real, verifiable photographs and discuss them critically. Analyse what AI images systematically get wrong about Iranian (or any) landscapes. … Design a clear, ethical workflow.
I then started a new chat, with no memory of the first one, and asked for a picture again, this time using its own phrases from the first chat about what it should not do.
The result was little different from the first picture: much the same imagery, just a little less colourful, with the mosque and the village smaller and the landscape larger.
I could have spent a lot of time asking ChatGPT to modify the picture with successive instructions. I tried this for a while. But I felt that I would never get to an image that represented what I had experienced in real life.
I know that my own experience is also mediated by an ideological gaze; but at least I’m working with my own prejudices, which I also have direct experience of, instead of with those of an AI device that is a lot of work to control and can only get so far.
This makes me think that, in social research and in intercultural encounters, working as directly as possible with personal experience is paramount.
Take, for example, an interview. We have to have been there. To have noticed all the things going on in and around the event. If we transcribe, with all the codes and conventions designed to minimise subjectivity, we must never forget the direct experience of which it is a representation. The same with the whole range of devices, in effect forms of AI, we use to objectify our research. Being systematic is the essence of research, but never at the expense of working with direct experience and all the ideological underpinnings that it carries. We don’t analyse the transcript; we use it to help us analyse what we remember about what actually happened.
Returning to the picture of the village and the valley. Any new images we make must be the result of our developing personal knowledge of all the villages and valleys we have seen and what we can work out about how we have constructed our imagination of them. These are the intercultural threads which make up our social environment, but which we must never stop directly making difficult sense of.
I would not have been able to have this discursive evaluation of ChatGPT without this direct experience.