Taming the Lion of the LLM

The paper “Miss Tammy as a Use Case for Moral Prompt Engineering” by Myriam Rellstab and Oliver Bendel from the FHNW School of Business was accepted at the AAAI 2025 Spring Symposium “Human-Compatible AI for Well-being: Harnessing Potential of GenAI for AI-Powered Science”. It describes the development of a chatbot that is available to pupils and can de-escalate their conflicts or promote constructive dialogue among them. Prompt engineering – called moral prompt engineering in the project – and retrieval-augmented generation (RAG) were used. The centerpiece is a collection of netiquettes. On the one hand, these control the behavior of the chatbot; on the other hand, they allow it to evaluate the behavior of the pupils and make suggestions to them. Miss Tammy was compared with a non-adapted standard model (GPT-4o) and outperformed it in tests with 14- to 16-year-old pupils. The project applied the discipline of machine ethics, in which Oliver Bendel has been conducting research for many years, to large language models, using the netiquettes as a simple and practical approach. The eight AAAI Spring Symposia will not be held at Stanford University this time, but at the San Francisco Airport Marriott Waterfront, Burlingame, from March 31 to April 2, 2025. It is a conference rich in tradition, where innovative and experimental approaches are particularly in demand.

Towards Moral Prompt Engineering

Machine ethics, which was often dismissed as a curiosity ten years ago, is now part of everyday business. It is required, for example, when so-called guardrails are used in language models or chatbots, via alignment in the form of fine-tuning or via prompt engineering. When you create GPTs, i.e. “custom versions of ChatGPT”, as OpenAI calls them, you have the “Instructions” field available for prompt engineering. Here, the “prompteur” or “prompteuse” can define certain specifications and restrictions for the chatbot. This can include references to documents that have been uploaded. This is exactly what Myriam Rellstab is currently doing at the FHNW School of Business as part of her final thesis “Moral Prompt Engineering”, the interim results of which she presented on May 28, 2024. As a “prompteuse”, she tames GPT-4o with the help of her instructions and – as suggested by the initiator of the project, Prof. Dr. Oliver Bendel – with the help of netiquettes that she has collected and made available to the chatbot. The chatbot is tamed: the tiger becomes a house cat that can be used without danger in the classroom, for example. With GPT-4o, guardrails are already in place. These were programmed in or obtained via reinforcement learning from human feedback. So, strictly speaking, you turn a tamed tiger into a house cat. This is different with certain open source language models. There, the wild animal must first be captured and then tamed. And even then it can seriously injure you. But even with GPTs there are pitfalls, and as we know, house tigers can hiss and scratch. The results of the project will be available in August. Moral prompt engineering had already been applied to Data, a chatbot for the Data Science course at the FHNW School of Engineering (Image: Ideogram).
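
To make the idea of instructions plus netiquettes a little more concrete, here is a minimal Python sketch. It is not the project's implementation: the rule texts, the names NetiquetteRule, build_instructions and retrieve_rules, and the keyword lookup are all assumptions made for illustration. It only shows how a collection of rules might be turned into instruction text and how rules relevant to a pupil's message might be looked up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetiquetteRule:
    """One rule from a hypothetical netiquette collection."""
    topic: str
    text: str


# Invented sample rules; the project's actual netiquettes are not reproduced here.
SAMPLE_RULES = [
    NetiquetteRule("insults", "Do not insult or belittle other people."),
    NetiquetteRule("listening", "Let others finish before you respond."),
    NetiquetteRule("privacy", "Do not share personal details about others."),
]


def build_instructions(rules):
    """Assemble instruction text of the kind one might paste into an 'Instructions' field."""
    header = (
        "You are a friendly mediator for pupils. "
        "Follow these netiquette rules yourself and use them to "
        "evaluate what pupils write:"
    )
    numbered = [f"{i}. {rule.text}" for i, rule in enumerate(rules, start=1)]
    return "\n".join([header, *numbered])


def retrieve_rules(message, rules):
    """Naive keyword lookup standing in for retrieval (an assumption, not real RAG)."""
    lowered = message.lower()
    return [rule for rule in rules if rule.topic in lowered]
```

Calling build_instructions(SAMPLE_RULES) yields a header line followed by one numbered line per rule, and retrieve_rules("He keeps shouting insults at me", SAMPLE_RULES) returns only the rule on insults. A real retrieval step would more likely rely on embeddings over the uploaded documents than on keyword matching.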