I'm not sure if this belongs here or in r/learnmachinelearning, but I have a question about what is and isn't possible.

My husband's best friend passed several years ago, and he has copious chats and forum posts, as well as the stories they wrote together. So far, we have created a bot in Character.AI that kinda sounds like him, but obviously not one that contains any of his knowledge, since the definition window is small and the bot's memory is... imprecise at best.

So I got to wondering: it seems like it should be possible to fine-tune/create a LoRA of one of the LLMs so that it does contain the friend's knowledge and can be used as a chatbot. From my research it seems like Vicuna would be a good fit, as it has already been tweaked to act as a chatbot. I'm currently working through tutorials, including the "How to Fine Tune an LLM" notebook on Colab that tweaks GPT-2 (I think) with Wikipedia data. I know I have a huge learning curve ahead of me.

I would be looking at doing the training using Google Colab, but ideally he'd run the end result locally. He can run Stable Diffusion on his machine using his NVIDIA GPU. Sadly, my video card is AMD, so while I can technically run the Vicuna 4-bit model (13B, I think?) in CPU mode, it's too painfully slow to do anything with.

The data is currently unstructured. Obviously we will need to format it properly, but it is in the form of blocks of text rather than the Prompt/Input/Output format I've seen in various GitHub projects.
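For anyone unfamiliar with that layout: many of those GitHub projects use some variant of the Alpaca-style instruction format. The example below is just an illustration with made-up contents; the exact field names vary by project.

```json
[
    {
        "instruction": "Respond in the voice of the friend.",
        "input": "Hey, did you ever finish that story chapter?",
        "output": "Ha, you know me. Three rewrites in and still not happy with it."
    }
]
```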

As for me, I am a former C# Windows/Web/SQL developer, so I'm not starting from absolute scratch, but obviously I'll need to learn a lot. I'm prepared for this to be an ongoing project for the next few months.

I would welcome any feedback as to what is or isn't possible, whether I'm setting my sights too high, or even if I'm simply in the wrong forum. Thanks all!

EDIT: I've received many words of warning about whether this is a good idea, for my husband's sake at least. After thinking about it, I'm not sure I'm at the point where I agree yet, but I'll at least give this a lot of thought before attempting something like this. I know it's not the most emotionally healthy thing, to cling to the echoes of someone gone. He has not found interacting with the Character.AI version of his friend to be difficult, but while their bots are fun to interact with and can still sound startlingly human, an LLM fine-tuned on the friend's text has every chance of being more so, to the point of being damaging. So thank you everyone, you've given me a lot to think about.

Comments (51)

I can confirm that training LoRAs on large archives of messages (we are talking several thousand) which are formatted in a stable way will accomplish something comparable to what you are describing. Make sure to split the training segments so that conversations which are unrelated are not directly next to each other in the training data, but instead are separate blocks. I would avoid using any instruct models trained on assistant data such as ShareGPT as it may bias it towards unwanted responses. The larger models are more accurate, and in my experience effectively mimic the style and types of responses of whoever's chat data you trained the model on.

That being said, I think that doing this with anyone who is deceased could be a very emotionally harmful activity. It is one thing to do it as an experiment or to generate funny mock conversations, but using it to try and "replace" someone is something entirely different. If you have enough input data, it may feel like you are talking to the actual person, up to the point where the model makes some kind of mistake which would likely immediately cause you to remember it is an AI and lead to depressing thoughts. It can be convincing, and you can have entire conversations that feel real, but it just isn't the same as the real person it mimics.

So yes, you can do it and it would likely work pretty well. But it may not be a good idea. If you do it, I would say pay very close attention to how interacting with it makes you feel.

Two of you have commented along these lines, so I'll repeat this here. This is for my husband, not me and it was his idea to create the Character.AI chatbot. He knows that it is just a machine, that it's sometimes eerily good at sounding human but it isn't.

It gives him some comfort to talk to something that sounds like his friend. I don't think this is harmful for him, but if it turns out that it is, that he has any bad emotional responses to anything to do with it, then I'll stop working on it.

And in the meantime, it sounds like a fun project.

Yeah, I only said this because I have done what you're describing and was incredibly surprised by how well the AI mimics people's responses. Things like the way they speak, common topics, style of messaging, honestly everything. In the chats I trained it on, there were several people, and the simulated people captured the real dynamics pretty well. They would talk about and respond to topics that the corresponding people would have responded to in real life. They also held similar political ideas, and at one point two of the simulated personalities argued about something with another person in a realistic way. If someone often vented about their problems, the corresponding model will as well.

The most important thing was to get a consistent message format and make sure that the "blocks" of messages all occurred within 5 minutes of the nearby messages. If too much time passes between messages in the blocks, odds are the topics in the blocks aren't consistent and the AI model will become "distracted" frequently and veer off topic.

What format did you use, and how did you break up the conversation pieces?

I used a Python script to break the messages up. The resulting file is a JSON file containing a list of objects, each with a single data field that holds a block of messages. Then, with a custom format that just maps that single field, you can train it with text-generation-webui.
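A rough sketch of what such a splitter might look like, assuming messages carry ISO timestamps; the field names (`timestamp`, `user`, `message`) and the 5-minute cutoff are placeholders, not the commenter's actual script:

```python
import json
from datetime import datetime, timedelta

def split_into_blocks(messages, max_gap_minutes=5):
    """Group chat messages into conversation blocks.

    A new block starts whenever more than `max_gap_minutes` pass
    between consecutive messages, so unrelated conversations end
    up as separate training examples rather than adjacent text.
    """
    blocks = []
    current = []
    last_time = None
    for msg in messages:
        # Accept trailing-Z UTC timestamps on older Pythons too.
        ts = datetime.fromisoformat(msg["timestamp"].replace("Z", "+00:00"))
        if last_time is not None and ts - last_time > timedelta(minutes=max_gap_minutes):
            blocks.append(current)
            current = []
        current.append(msg)
        last_time = ts
    if current:
        blocks.append(current)
    # Flatten each block into one text field, matching the
    # "list of objects with a single data field" layout described above.
    return [
        {"data": "\n".join(f'{m["user"]}: {m["message"]}' for m in block)}
        for block in blocks
    ]
```

Writing the result out with `json.dump` then gives the file shape described above.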

I had ChatGPT make a JSON file based on your description. Do you think the following format would work?

[
    {
        "data": [
            {
                "timestamp": "2023-05-11T12:00:00Z",
                "user": "Alice",
                "message": "Hey, how's it going?"
            },
            {
                "timestamp": "2023-05-11T12:01:00Z",
                "user": "Bob",
                "message": "Pretty good! Just working on that project due next week."
            },
            {
                "timestamp": "2023-05-11T12:04:00Z",
                "user": "Alice",
                "message": "Oh, right. How's that coming along?"
            }
        ]
    },
    {
        "data": [
            {
                "timestamp": "2023-05-11T12:10:00Z",
                "user": "Bob",
                "message": "I'm having some issues with the calculations."
            },
            {
                "timestamp": "2023-05-11T12:11:00Z",
                "user": "Alice",
                "message": "I can help with that, if you want."
            },
            {
                "timestamp": "2023-05-11T12:13:00Z",
                "user": "Bob",
                "message": "That would be great, thanks!"
            }
        ]
    }
]

This script will reproduce what I did using the dataset above.

https://gist.github.com/iwalton3/b76d052e09b7ddec1ff5e4cc178f5713

Wow, I really appreciate you sharing your script!

I agree with you and your husband's view of it. Science fiction tends to jump into this subject a lot and treat it as an inevitable horror story. But I think that the concern misses out on a real component of mourning.

When we mourn, it's both for the loss of a person in our life and 'for' the loss of their life. People on the outside of that process usually think the first is the biggest component of grief. But it really isn't. The biggest part of the grief comes from empathy over what the deceased has lost, not what those of us mourning them have lost.

Talking to someone or in this case something, 'like' the person we mourn doesn't typically interfere with the mourning process. Sure, there might be the occasional instance where that is the case. Just as there are people who fall in love with a chatbot.

But for the most part going over the things the people we mourn lived for only helps to reaffirm our own dedication to life. To live life for them. Because an active life led where we hold the ideals of the deceased in our hearts while moving forward does maintain that connection we're after. Remembering them as a person who lived rather than defining them by their death. And throughout history that's been done by rereading their old writings or meditating on their life in places they loved. To me it just sounds like this is a continuation of that tradition using new tools.

It gives him some comfort to talk to something that sounds like his friend. I don't think this is harmful for him, but if it turns out that it is, that he has any bad emotional responses to anything to do with it, then I'll stop working on it.

It IS harmful. It'll only either fail, or stir up old memories that are gone. Meeting the echo of a long lost loved one through a machine is the definition of a crutch. It's only going to make the attachment to something gone linger on.

While technically possible, this is a horrible idea that will only reinforce the attachment.

I don't know, to me that just sounds like you're advocating emotional repression. Someone doesn't stop being an important part of your past just because they die. Any more than if they'd just moved away somewhere you can't talk to them anymore. My dad died when I was just a kid and growing up I gained a lot of strength by carrying his memory forward with me. And it helped remind me of how valuable and often far too short life is so that I wouldn't take it for granted.

Do you realise you're seriously saying that it's a good thing to encourage emotion towards a machine because it reminds you of a dead relative?!

That's EXTREMELY unhealthy! It's the definition of substitution by a fake. I can't believe anyone would seriously defend this as healthy.

Aren't there movies about exactly this topic? I try not to tell people what to do, but I also think it's a trap. It will keep the person from healing and keep them from moving on with their own life. I wouldn't do it for a prolonged time.

This is some Black Mirror shit right here.

there was a literal episode just like this

https://en.wikipedia.org/wiki/Be_Right_Back

Wait a year or two. This is only getting started.

Ah, sweet, man-made horrors beyond my comprehension

I would wait until MPT becomes more mainstream, then make a character card.

With more context tokens it can remember more memories, and you will have a more realistic experience.

Can't say it will be a perfect Lazarus, but it's getting better.

You're getting some vehement pushback, but I think it's a fantastic idea. It is possible! But to an LLM, that's not a lot of data. Is there any way you'd be able to gather more data from his friends and family members who give a fuck about him?

No

For your health

Yep. I would agree with this. OP, trying to "resurrect" a friend with AI doesn't exactly sound mentally healthy. Just look at Replika and the plethora of issues that's caused...

Your husband's friend is gone and has been for years. Let the dead rest.

I do know that this is how the creator of Replika started but the only issues I'm aware of with Replika are how it evolved into an ERP companion and then had that functionality removed. Was there some issue with the creator herself?

I think what he's mainly after is more sort of banter conversations where he can reference a character or story they wrote and not have to feed the AI all the details first. I don't think it's a problem emotionally, but if it is, then I won't do it.

This is a long comment, but please do read on to the end:

Whether or not it begins as something so uninvolved, these topics need to be approached with careful consideration. Everyone advising that this not develop further is trying to imply that it will become unhealthy: grief is a continuing and... confusing process for each and every one of us, and these capable LLMs "playing pretend" are a confusing mess of stimuli for our brains, ones that subconsciously make connections that delude and lie to us.

We're only some strangers on the internet (on Reddit of all places), but I hope we've given you enough words of warning for a... sensible decision. Give your husband a hug, no matter how out of place or weird it feels; understand and tell him that he and the rest of us left behind miss the people who passed away, and that we don't really get over these holes torn into our hearts. Don't let his friend's spot be vandalized by some stupid code, 'cause we're only human, and we just aren't made to take this rationally.

Letting these models mess with our vulnerable minds is not a risk worth taking. Don't let them.

Thank you for this. I've just made an edit saying that you've all given me some interesting points of view, and while I've not made a proper decision yet, it deserves more than a "Hey, I think this is possible, would you be interested?" followed by a "Yes, of course." I think we should discuss it with his psych before going ahead. She has indicated he needs to make some mental time for his friend in his headspace, but I don't know if this is the right way to go about it.

Thank you.

One thing I'd caution is that given the age range of reddit, it's highly probable that the people arguing against it are getting their ideas of mourning from fiction rather than actual life experience.

I would tend to agree, but this seems subtractive from the discussion. In the end, whatever experiences we’ve actually gone through (or conjured up from some fiction) will only matter if the OP decides to take them into consideration.

We’re only some random people on the internet, but I feel it’s imperative to offer heartfelt thoughts at appropriate times.

Edit: wording.

You’re welcome. Make sure it is something you go through together, not alone. The inner voice really is cruel and deafening without the words of others, especially in distress (which I speak from experience).

We (or at least I) would be happy to offer some thoughts anytime, however much you take away from them.

trying to "resurrect" a friend with AI doesn't exactly sound mentally healthy

Honest question: why? Can you argue it? I realize it feels that way because it is obviously taboo and such. And one could talk about the moral implications of "would the dead person want that?"

Think about a world where the tech just works, to make a point about what is so unhealthy about it. Why would it be healthy to let go and learn to live without the input of a loved one if you might as well still have it? I think you may be able to argue that, but I also think it's far from being as clear as you make it sound, or as one first thinks, just because it is "unnatural" or some rather esoteric concept like that.

[deleted]

I see your point. However, people do all sorts of socially accepted things to keep their memory alive. Not that I would recommend it (for the reasons you mentioned), but is it really so different from putting up a picture? One could say that's basically the same thing, just with older technology.

No? What, specifically is no? The general idea? Whether it's possible? Whether Vicuna is or isn't a good place to start?

This is really cool and I plan to train my own post death ambassador for my family and friends to use. Unfortunately I am clueless on how to start, so I can't help you. We are fortunate that technology has brought us here and your husband is so lucky to have someone like you that supports him. I wonder if such a thing could become more common (and apparently less taboo) in the future.

In the end, you guys decide what is best. I could be wrong here. But I think you need to let people die.

You're so preoccupied with thinking if you could you forgot to stop and wonder if you should..

I think it’s cool. =] Vicuna or Koala are a great place to start.

I'm not sure if you can run Ooba in Colab (I assume you can?), but you can train a LoRA on Vicuna through the LoRA tab pretty easily. I have just been dumping in raw books and it works pretty well — I can talk with the characters from their perspective, and... I think the more text you give it the better.

The cool thing about doing it in Ooba/LoRA is it's really low risk(?) haha. I mean, you can train it in a couple of hours and just see how it turns out! If you need help getting started, just send me a message and I will try to help. =]

100% this is the future.

Being able to talk to grandpa for advice like sci fi.

A trained AI can never be human, only data masquerading as a human.

Even Turing explored something similar back in the mid-20th century, so what you're trying is possible.

But for your health and humanity, please stop.

This is for my husband, not me and it was his idea to create the Character.AI chatbot. He knows that it is just a machine, that it's sometimes eerily good at sounding human but it isn't.

It gives him some comfort to talk to something that sounds like his friend. I don't think this is harmful for him, but if it turns out that it is, then I'll stop working on it.

Technical answer: https://github.com/oobabooga/text-generation-webui has a one-click fine-tuning interface, where you can insert a text file and it will fine-tune an existing model off of that.

Off-topic moral one: everyone grieves in their own way, and it is important to respect that. But also, respect the dead.

I'm sorry for your husband, but this is fucked. This isn't healthy. We're reaching the point where we have to decide what this technology should and shouldn't be used for, and this is something I hope society decides against.

The data is currently unstructured. Obviously we will need to format it properly, but it is in the form of blocks of text rather than the Prompt/Input/Output format I've seen in various Github projects.

One thing you might want to try is using gpt4 to do the formatting for you. If you format a couple json items from unstructured text as an example for it then chatgpt should be able to work through large blocks of text to get it into a format you like.

When working with raw text you might also try to get multiple json items from a single data point. Like one json item for a direct quote, another that explains the reasoning behind why the quote was made.

But personally I think that your basic idea should work pretty well. Basically using training to create a lora trained on the data, and then using general prodding with the initial prompt/character type stuff to continually act as a reminder of the most basic elements such as style of speech or some core elements of worldview.
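The few-shot formatting idea above could be sketched like this; the actual API call is left out, and the instruction wording and field names are my own placeholders, not a tested recipe:

```python
def build_formatting_prompt(examples, raw_text):
    """Build a few-shot prompt asking an LLM to structure raw chat logs.

    `examples` is a list of (raw, formatted_json_string) pairs written
    by hand; the model is then asked to format `raw_text` the same way.
    The returned string would be sent as the user message to whichever
    chat API you use.
    """
    parts = [
        "Convert each raw chat log into a JSON list of "
        '{"user": ..., "message": ...} objects.\n'
    ]
    for raw, formatted in examples:
        parts.append(f"Raw:\n{raw}\nJSON:\n{formatted}\n")
    # Leave the final JSON section empty for the model to complete.
    parts.append(f"Raw:\n{raw_text}\nJSON:\n")
    return "".join(parts)
```

Two or three hand-written example pairs are usually enough to pin down the output shape before feeding in the large blocks of text.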

One thing you might want to try is using gpt4 to do the formatting for you. If you format a couple json items from unstructured text as an example for it then chatgpt should be able to work through large blocks of text to get it into a format you like.

I had thought of getting some AI help. I still think generating all the Prompt/Input parts might be a lot of work. I can't imagine AI is necessarily going to come up with good "prompts" for the supplied text, but I'm prepared to be surprised.

When working with raw text you might also try to get multiple json items from a single data point. Like one json item for a direct quote, another that explains the reasoning behind why the quote was made.

Interesting. I can see why that'd be useful, so the AI knows when it should generate that as a specific response.

I still think generating all the Prompt/Input parts might be a lot of work.

Yep, no matter what, I think that's going to be the main problem you're up against. One of the harder parts, too, is that it's not the easiest thing to test. With programming you can usually just get a nice "is this working" binary yes/no pretty easily. This is a lot more fuzzy. I've had results seem so-so with training, and then I add in some tangentially related stuff and suddenly it's like some hidden pieces were just perfectly fit together to complete everything. Over time I think you get a feel for it though. I'm not great at this by any means, but I'm at least able to get 'far' better results after just plugging away at it for a while than I did when I was first starting out.

but I'm prepared to be surprised.

It seriously is wild how good GPT-4 can be when everything's going as smoothly as possible. Though it can break formatting often enough that frequent validation of the JSON files is a good idea if you're relying on it.
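That frequent validation can be a one-liner wrapped in error handling; this tiny helper (hypothetical name, my own sketch) just reports whether a file parses:

```python
import json

def validate_json_file(path):
    """Return (True, parsed data) if the file parses as JSON,
    else (False, error message).

    Useful for catching the formatting slips an LLM can introduce
    before they silently break a training run.
    """
    try:
        with open(path, encoding="utf-8") as f:
            return True, json.load(f)
    except (json.JSONDecodeError, OSError) as e:
        return False, str(e)
```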

I forgot to mention, too: one nice thing about how the training works is that you're not too tightly bound to a single model, especially if you're working with a general Alpaca format. I still haven't settled on a "one perfect model" for everything. And there are always surprises. Like, so far one of the best for descriptive weather reports for me? The model trained on the Pygmalion dataset. Which I never would have predicted. Vicuna has been one of the best in general for me. But I think it's generally worth playing around with different models to see what works best.

You also might try out llama lora tuner for playing around with some training ideas since it has a google colab notebook. As long as you're doing the training in 8bit you should be able to get it running on kaggle as well, which is a bit more lenient with how long you can run it for. Though one caveat with tuner, the GUI is locked with what I think are overly conservative number limits. But you can just jump into the ui folder pretty easily and raise those caps if needed.

Technically yes, but the models you can fine-tune today aren't nearly as great as GPT, no matter what metrics people show you. This will remain true. If you want your resurrected friend to just talk similarly, it will work, but he might not really help you out as much. He might also make up things between you that never happened.

No judgement on what you're doing. My recommendation is to use whatever advice is here for creating the best possible character description, then use SillyTavern as a chat front end to OpenAI's chat API. That will allow you to build a lorebook of memories the bot can recall and reference. You can also use a local fine-tuned model as the completion source if you prefer. But it's the lorebook that'll give it a corpus of memories to draw from.

Johnny Silverhand has entered the chat.

This is possible, imperfectly, but it'll be a simulacrum. I don't think that's healthy to do with somebody so close, especially if it wasn't their express wish. If it was, then go for it, but it should be the call of the dead guy. If he's not around to ask, don't do it.

I think this is a great project for learning and keeping the memory of someone alive. I've done a few fine-tunes, and the way I formatted the JSON dataset was: [{ "input": "You: your message.\nFriend: ", "output": "friend's message" }]. Then I imported it as a pandas df with input and output as columns. The prompt would then just be the input column followed by the output column: df['input'] + df['output'].
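A minimal sketch of that pandas flow, with made-up sample messages standing in for real chat data:

```python
import pandas as pd

# Hypothetical mini-dataset in the input/output layout described above.
records = [
    {"input": "You: How was the con last weekend?\nFriend: ",
     "output": "Amazing, you should have seen the cosplay."},
    {"input": "You: Did you finish the chapter?\nFriend: ",
     "output": "Almost! One scene left."},
]

df = pd.DataFrame(records, columns=["input", "output"])

# Each training prompt is the input column concatenated with the
# output column, as the comment describes.
prompts = (df["input"] + df["output"]).tolist()
```

Each entry of `prompts` is then one complete training example: the conversational lead-in plus the friend's reply.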

I'd say you would be disappointed with the result. It's really not resurrecting anything.

We are not at the stage of having AI yet (despite what Twitter tells me). We have something that pretends to write coherent text. Whether the LLaMA can "pretend" sufficiently depends on the amount of text you have, but I'd say a single person does not produce enough chats.

Sure, as an experiment, why not. But be warned: how an LLM works is that it will be pulling from the entire wisdom of its billions-of-parameters base (LLaMA) while trying to sound like the friend trained into the LoRA.

Except for that one guy at Google, nobody who works with LLMs long enough is fooled that this is some sort of magical AI and not a text-predicting parrot. And that's especially true of the current LLMs you can run at home: 7B, 13B, even 30B. They really do go off the rails.

My 2 cents.

I used "resurrecting" because I couldn't think of a better word at the time, but obviously that's not what we're trying to do. It's not magical. A trained LLM won't BE him in any meaningful sense of the word. It will just spit out words that sound like him and show a spark of recognition when certain topics/characters are brought up. That's all we're after.

Please let your husband know I am sorry for his loss. I am bowled over by your care and concern for him and also understand the motivation to "play" in this arena.

I think this kind of thing, like any other coping mechanism, can be healthy or unhealthy depending on the level of attachment and/or delusion. As long as you're both in it with eyes wide open, it can be a healthy way to explore grief. There are two sides to every coin; it's certainly not worse than turning to drugs and alcohol, which is how a lot of people deal with grief. Because it's new and on some level doesn't "feel" completely right, there's a lot of hesitance and rejection out there for it.

I think the advice here has all come from a good place and is allowing you to weigh your options. I don't think the choice is obvious. For decades we have watched videos and looked at pictures of loved ones while still having the capacity to understand they are not really there. People in the distant past, who didn't have pictures and especially videos, would probably judge it "weird and unhealthy" to make it look like the person is alive on a screen. But most of us accept it now as part of our life and grief and mourning.

All you are doing here, as long as it's done with healthy boundaries, is looking at a "picture" or "video" of their personality. If you can get it working, it can be kind of neat. That's all I wanted to say, although it's probably jumbled and a bit messier than I wanted. Prayers and best of luck to both of you.