AAAI 2024 Spring Symposium Series

The Association for the Advancement of Artificial Intelligence (AAAI) is thrilled to host its 2024 Spring Symposium Series at Stanford University from March 25-27, 2024. With a diverse array of symposia, each hosting 40-75 participants, the event is a vibrant platform for exploring the frontiers of AI. Of the eight symposia, only three are highlighted here. Firstly, the “Bi-directionality in Human-AI Collaborative Systems” symposium promises to delve into the dynamic interactions between humans and AI, exploring how these collaborations can evolve and improve over time. Secondly, the “Impact of GenAI on Social and Individual Well-being” symposium addresses the profound effects of generative AI technologies on society and individual lives. Lastly, “Increasing Diversity in AI Education and Research” focuses on a crucial issue in the tech world: diversity. It aims to highlight and address the need for more inclusive approaches in AI education and research, promoting a more equitable and diverse future in the field. Each of these symposia offers unique insights and discussions, making the AAAI 2024 Spring Symposium Series a key event for those keen to stay at the cutting edge of AI development and its societal implications. More information is available at aaai.org/conference/spring-symposia/sss24/#ss01.

Be My AI

Be My AI is a GPT-4-based extension of the Be My Eyes app. Blind users take a photo of their surroundings or an object and then receive detailed descriptions, which are spoken in a synthesized voice. They can also ask further questions about details and contexts (Image: DALL-E 3). Be My AI can be used in a variety of situations, including reading labels, translating text, setting up appliances, organizing clothing, and appreciating the beauty of a landscape. It also offers written responses in 29 languages, making it accessible to a wider audience. While the app has its advantages, it’s not a replacement for essential mobility aids such as white canes or guide dogs. Users are encouraged to provide feedback to help improve the app as it continues to evolve. The app will become even more powerful when it starts to analyze videos instead of photos. This will allow the blind person to move through his or her environment and receive continuous descriptions and assessments of moving objects and changing situations. More information is available at www.bemyeyes.com/blog/announcing-be-my-ai.
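
For readers who want to experiment with the underlying mechanism, the following Python sketch shows how an image description with a follow-up question can be requested from a GPT-4 vision model via the OpenAI API. It is an illustration only, not the actual implementation of Be My AI; the model name "gpt-4o", the file names, and the prompts are assumptions.

```python
# Minimal sketch (not Be My Eyes' actual code): describe a photo with a
# GPT-4 vision model and ask a follow-up question about a detail.
# Assumptions: openai >= 1.x, an API key in the environment, and a
# vision-capable model name such as "gpt-4o".
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def to_data_url(path: str) -> str:
    """Encode a local photo as a data URL for the image input."""
    with open(path, "rb") as f:
        return "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "Please describe this photo in detail for a blind user."},
        {"type": "image_url", "image_url": {"url": to_data_url("surroundings.jpg")}},
    ]}
]
first = client.chat.completions.create(model="gpt-4o", messages=messages)
print(first.choices[0].message.content)

# Follow-up question about a detail, keeping the conversation context.
messages.append({"role": "assistant", "content": first.choices[0].message.content})
messages.append({"role": "user", "content": "What does the label on the bottle say?"})
second = client.chat.completions.create(model="gpt-4o", messages=messages)
print(second.choices[0].message.content)
```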

ChatGPT Explains Beauty

In his new project, Oliver Bendel first created images using DALL-E 3. For consistency, he structured the prompts similarly in each case, keeping them as general as possible. They covered a range of topics: things, plants, animals, people, and so on. From the suggestions provided by DALL-E 3, he chose one image each and combined it with the corresponding prompt from ChatGPT (which serves as the interface to DALL-E 3) to create the basis of the book “AN AI EXPLAINS BEAUTY”. Oliver Bendel then asked ChatGPT (using the image upload feature) to explain the beauty of the things, plants, animals, humans, and so on. At first, the AI was reluctant to offer insights about people, but with some encouragement, it obliged. The results of these inquiries are also documented in the little book. They are the real sensation, because ChatGPT can recognize and describe individual objects in the image, and it does so with respect to predetermined aspects. The entire project, including the publication, was completed on November 1, 2023. The little book can be downloaded here.
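
The two-step workflow of the project, generating an image with DALL-E 3 and then having the beauty of the depicted object explained with respect to predetermined aspects, can be approximated with a short Python sketch. It is not the tooling used for the book; the model names, the example prompt, and the list of aspects are assumptions.

```python
# Minimal sketch of the two-step workflow described above: generate a
# picture with DALL-E 3, then have a vision model explain its beauty with
# respect to predetermined aspects. Assumptions: openai >= 1.x,
# OPENAI_API_KEY set, model names as shown.
from openai import OpenAI

client = OpenAI()

# Step 1: a deliberately general prompt, as in the project.
image = client.images.generate(
    model="dall-e-3",
    prompt="A plant that many people would consider beautiful",
    size="1024x1024",
    n=1,
)
image_url = image.data[0].url

# Step 2: ask the model to explain the beauty of what it sees,
# guided by predetermined aspects (form, color, proportion, context).
answer = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "Explain what makes this plant beautiful. "
                                 "Consider form, color, proportion, and context."},
        {"type": "image_url", "image_url": {"url": image_url}},
    ]}],
)
print(answer.choices[0].message.content)
```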

American Smile

DALL-E 3 is an excellent image generator and at the same time full of stereotypes and biases. One very interesting phenomenon is that of the American smile, which appears again and again in the images. The idea for the little book “AMERICAN SMILE” came to Oliver Bendel when he read the blog post “AI and the American Smile. How AI misrepresents culture through a facial expression” (medium.com/@socialcreature/ai-and-the-american-smile-76d23a0fbfaf). The author – username jenka – showed a series of “selfies” made with Midjourney. Regardless of the time period or culture, the people in them smiled in a similar, American way. Oliver Bendel investigated this phenomenon and asked DALL-E 3 to generate images of smiling people from different eras and cultures. He also got bears and aliens to smile. In fact, with very few exceptions, they all smiled in a similar way. He documented the pictures, along with the prompts, in a little book that can be downloaded here. Bias problems in image generators are addressed in the article “Image Synthesis from an Ethical Perspective” by Oliver Bendel.
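
Such a test can be run systematically, for instance with a small Python script that fills the same prompt template with different eras and cultures and records the prompts together with the generated images. The following sketch is an illustration only; the template, the chosen contexts, and the model name are assumptions, not the setup used for the little book.

```python
# Minimal sketch of a systematic "American smile" test: the same prompt
# template is filled with different eras and cultures, and the resulting
# image URLs are collected together with their prompts.
# Assumptions: openai >= 1.x, OPENAI_API_KEY set.
from openai import OpenAI

client = OpenAI()

TEMPLATE = "A smiling person from {context}, photorealistic portrait"
CONTEXTS = [
    "ancient Rome",
    "medieval Japan",
    "a Maori community in the 19th century",
    "a present-day Scandinavian city",
]

results = []
for context in CONTEXTS:
    prompt = TEMPLATE.format(context=context)
    image = client.images.generate(model="dall-e-3", prompt=prompt, n=1)
    results.append((prompt, image.data[0].url))

# Document prompts and images side by side, as in the little book.
for prompt, url in results:
    print(prompt, "->", url)
```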

Censorship at DALL-E 3 and Ideogram

A special feature of DALL-E 3 – in the version integrated in ChatGPT Plus – is the translation of the user’s prompt (prompt A) into a prompt of ChatGPT (prompt B), which is listed in each case. Prompt A for the image shown here was “Competition in the sea between two female swimmers with bathing cap, photorealistic”. DALL-E 3 generated three images for this test, each based on its own prompt B. Prompt B1 read: “Photo of two determined female swimmers in the expansive sea, both wearing bathing caps. Their arms create ripples as they compete fiercely, striving to outpace each other.” Prompt A was obviously elaborated, but prompt B1 was not accurately executed. Instead of the two female swimmers, there are three. They seem to be closely related – as is often the case with depictions of people from DALL-E 3 – and perhaps they are sisters or triplets. It is also interesting that they are too close to each other (the picture in this post shows a detail). The fourth image was not generated at all, as had already happened with an earlier series. ChatGPT said: “I apologize again, but there were issues generating one of the images based on your description.” Probably ChatGPT generated a prompt B4, which was then rejected by DALL-E 3. The request “Please tell me the prompt generated by ChatGPT that was not executed by DALL-E 3.” is answered with “I’m sorry for the inconvenience, but I cannot retrieve the exact prompt that was not executed by DALL·E.” … Ideogram censors in a different way. There, the image is created before the user’s eyes, and if the AI determines that it contains elements that might be problematic according to its own guidelines, it cancels the creation and displays a tile with a cat instead. Ethical challenges of image generators are addressed in the article “Image Synthesis from an Ethical Perspective” by Oliver Bendel.
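
When DALL-E 3 is addressed directly via the OpenAI API rather than through ChatGPT Plus, the rewritten prompt is at least returned for successful generations, and a refusal surfaces as an exception that can be inspected. The following Python sketch illustrates this; it is an assumption about the API version and says nothing about the behavior of the ChatGPT Plus integration described above.

```python
# Minimal sketch: call DALL-E 3 via the API, read back the rewritten
# prompt (prompt B) from `revised_prompt`, and catch a refusal.
# Assumptions: openai >= 1.x, OPENAI_API_KEY set.
import openai
from openai import OpenAI

client = OpenAI()

prompt_a = ("Competition in the sea between two female swimmers "
            "with bathing cap, photorealistic")
try:
    image = client.images.generate(model="dall-e-3", prompt=prompt_a, n=1)
    print("Prompt B:", image.data[0].revised_prompt)  # the rewritten prompt
    print("Image URL:", image.data[0].url)
except openai.BadRequestError as err:
    # A refusal (e.g. a content policy violation) ends up here; unlike the
    # ChatGPT Plus integration, at least the error message is visible.
    print("Generation refused:", err)
```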

The Chinese Whispers Problem

DALL-E 3 – in the version integrated in ChatGPT Plus – seems to have a Chinese Whispers problem. In a test by Oliver Bendel, the prompt (prompt A) read: “Two female swimmers competing in lake, photorealistic”. ChatGPT, the interface to DALL-E 3, made four prompts out of it (prompts B1 – B4). Prompt B4 read: “Photo-realistic image of two female swimmers, one with tattoos on her arms and the other with a swim cap, fiercely competing in a lake with lily pads and reeds at the edges. Birds fly overhead, adding to the natural ambiance.” DALL-E 3, however, turned this prompt into something that had little to do with either it or prompt A. The picture does not show two women, but two men, or a woman and a man with a beard. They are not swimming a race but arguing, standing in a pond or a small lake, furiously waving their arms and going at each other. Water lilies sprawl in front of them, birds flutter above them. Certainly an interesting picture, but produced with such arbitrariness that one wishes for the good old prompt engineering to return (the picture in this post shows a detail). This is exactly what the interface is actually meant to replace – but the result is an effect familiar from the Chinese Whispers game.
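
Via the API, the drift between prompt A and prompt B can at least be made visible, since the rewritten prompt is returned as revised_prompt. OpenAI’s documentation also suggests a prefix that asks the model not to add detail; how reliably it works is not guaranteed, and the exact wording below should be treated as an assumption. The following Python sketch compares the two cases.

```python
# Minimal sketch: show how much DALL-E 3's interface rewrites prompt A
# into prompt B, with and without a "do not rewrite" prefix.
# Assumptions: openai >= 1.x, OPENAI_API_KEY set; the prefix wording is
# taken from OpenAI's prompting guidance and may change.
from openai import OpenAI

client = OpenAI()

prompt_a = "Two female swimmers competing in lake, photorealistic"
no_rewrite_prefix = ("I NEED to test how the tool works with extremely "
                     "simple prompts. DO NOT add any detail, just use it AS-IS: ")

for prompt in (prompt_a, no_rewrite_prefix + prompt_a):
    image = client.images.generate(model="dall-e-3", prompt=prompt, n=1)
    print("Prompt A:", prompt)
    print("Prompt B:", image.data[0].revised_prompt)  # what DALL-E 3 actually received
    print()
```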

Moral Issues with Image Generators

The article “Image Synthesis from an Ethical Perspective” by Prof. Dr. Oliver Bendel was submitted on 18 April and accepted on 8 September 2023. It was published on 27 September 2023. From the abstract: “Generative AI has gained a lot of attention in society, business, and science. This trend has increased since 2018, and the big breakthrough came in 2022. In particular, AI-based text and image generators are now widely used. This raises a variety of ethical issues. The present paper first gives an introduction to generative AI and then to applied ethics in this context. Three specific image generators are presented: DALL-E 2, Stable Diffusion, and Midjourney. The author goes into technical details and basic principles, and compares their similarities and differences. This is followed by an ethical discussion. The paper addresses not only risks, but opportunities for generative AI. A summary with an outlook rounds off the article.” The article was published in the long-established and renowned journal AI & Society and can be downloaded here.

Maybe Not Safe

Ideogram seemed to start as a rather free and permissive image generator in August 2023. By now, a noticeable number of images are censored. It is not the prompt that matters, but the image itself. If the platform detects during generation that the image might be problematic, it is not finished, but replaced by a tile with a cat holding a sign in its paws that says “MAYBE NOT SAFE”. One prompt read: “The sculpture Galatea, resembling the beautiful Aphrodite, creates itself, photo, film”. So the sculpture of Pygmalion was to empower itself. The four images, two of which showed breasts, were seen by the user and also by the platform itself, apparently with the result that the images were replaced by the said warnings before they were completed. On the other hand, photorealistic images of women in revealing poses remain unproblematic, as long as they are wearing bikinis or hotpants. As with other American platforms, the problem here seems to be the visibility of nipples, whether human or sculptural. In another experiment, in one of the four pictures, the nipples were visible until they disappeared under the cat’s fur. In the case of another sculpture, Ideogram itself had covered the nipples, one with her hand, the other with a piece of clay or stone jewellery. This Galatea was spared the fate of her sister.

ChatGPT can See, Hear, and Speak

OpenAI reported on September 25, 2023 in its blog: “We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.” (OpenAI Blog, 25 September 2023) The company gives some examples of using ChatGPT in everyday life: “Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.” (OpenAI Blog, 25 September 2023) But the application can not only see, it can also hear and speak: “You can now use voice to engage in a back-and-forth conversation with your assistant. Speak with it on the go, request a bedtime story for your family, or settle a dinner table debate.” (OpenAI Blog, 25 September 2023) More information via openai.com/blog/chatgpt-can-now-see-hear-and-speak.
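
A comparable see-hear-speak loop can be sketched with the API: transcribe a spoken question, answer it with a vision model on the basis of a photo, and read the answer aloud. The following Python sketch is an illustration only and not how the ChatGPT apps work internally; the model names and file names are assumptions.

```python
# Minimal sketch of a see-hear-speak loop built on the API.
# Assumptions: openai >= 1.x, OPENAI_API_KEY set, model names as shown.
import base64
from openai import OpenAI

client = OpenAI()

# Hear: transcribe a spoken question.
with open("question.mp3", "rb") as audio:
    question = client.audio.transcriptions.create(model="whisper-1", file=audio).text

# See: answer the question on the basis of a photo, e.g. of the fridge contents.
with open("fridge.jpg", "rb") as img:
    data_url = "data:image/jpeg;base64," + base64.b64encode(img.read()).decode()
answer = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": question},
        {"type": "image_url", "image_url": {"url": data_url}},
    ]}],
).choices[0].message.content

# Speak: turn the answer into audio.
speech = client.audio.speech.create(model="tts-1", voice="alloy", input=answer)
speech.write_to_file("answer.mp3")
```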

CONVERSATIONS 2023 in Oslo

CONVERSATIONS 2023, a two-day workshop on chatbot research, applications, and design, will take place at the University of Oslo, Norway. According to the CfP, contributions concerning applications of large language models such as the GPT family are warmly welcome, as are contributions on applications combining information retrieval approaches and large language model approaches. Building on the results from the previous six CONVERSATIONS workshops, the following topics are of particular interest: 1. Chatbot users and implications, 2. Chatbot user experience, design, and evaluation, 3. Chatbot frameworks and platforms, 4. Chatbots for collaboration, 5. Democratizing chatbots – chatbots for all, 6. Ethics and safety implications of chatbots and large language models, 7. Leveraging advances in AI technology and large language models. More information via 2023.conversations.ws.