Background
- ◯ Purpose
The purpose of this tool is to enable users to create a system that allows them to easily converse with their favorite characters. [※1]
- ◯ Background
Character generators and image/speech synthesis models make it easy to create characters, while language generation and sentiment analysis models allow for natural conversations with characters.
Furthermore, language generation models can also produce the network/server configuration and the code (programs in various languages) that connects the characters to the generative models.
- ◯ Problem
However, applying this generated code in the real world can be quite difficult; the real world is not as clean as language generation models assume.
Furthermore, reusing that code requires a certain amount of knowledge: code for different character setups overlaps considerably, yet it often differs in subtle ways, which makes reuse difficult.
To begin with, many people are reluctant to deal with the trouble that comes with installing applications, or to write code at all.
- ◯ Approach
This tool [1] uses containers to avoid conflicts between applications and [2] connects the containers over a network, allowing the configuration to be changed flexibly (sketched below). [※2]
In addition, [3] every part of the tool can be customized, and customizations can be [4] created and published as plugins. [※3]
Conversations and their associated triggers/actions can be built as a visual workflow [5]. For complex flows, [6] it is also possible to write the code by hand (sequential, branching, and iterative constructs; sketched below).
All of these plugins, including containers and workflows [7], can be created, deleted, started, and stopped in the same way.
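As a rough illustration of [1], [2], and [7], the sketch below uses the Docker SDK for Python (an assumed choice, not necessarily what this tool uses internally) to start two hypothetical servers on a shared network and then stop and remove them in a uniform way. The image names, container names, and network name are placeholders.

```python
# Minimal sketch of [1] container isolation, [2] a shared network, and
# [7] a uniform create/start/stop/delete lifecycle.
# Image names, container names, and the network name are hypothetical.
import docker

client = docker.from_env()

# [2] A user-defined bridge network connecting the character's components.
net = client.networks.create("character-net", driver="bridge")

# [1] Each generative model runs in its own container, so its dependencies
# cannot conflict with anything else installed on the host.
containers = [
    client.containers.run(
        "example/llm-server:latest",   # placeholder image
        name="llm-server",
        network="character-net",
        detach=True,
    ),
    client.containers.run(
        "example/tts-server:latest",   # placeholder image
        name="tts-server",
        network="character-net",
        detach=True,
    ),
]

# [7] Every component is managed the same way: stop and remove uniformly.
for c in containers:
    c.stop()
    c.remove()
net.remove()
```

Because each model lives in its own container, swapping out a speech-synthesis or language model amounts to replacing one image rather than reinstalling anything on the host.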
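Points [5] and [6] can be pictured as follows: a conversation flow drawn visually as triggers and actions has a straightforward hand-written equivalent when branching or iteration is needed. The sketch below is only an assumed shape for such a flow; the function names are illustrative and are not this tool's actual API.

```python
# Hypothetical hand-written equivalent of a visual trigger/action workflow
# ([5], [6]): sequential steps, a branch on the user's input, and a loop
# that runs until the user says goodbye. All names are placeholders.
from typing import Callable

def run_conversation(
    listen: Callable[[], str],                 # trigger: user speech/text input
    generate_reply: Callable[[str], str],      # action: language-generation call
    classify_sentiment: Callable[[str], str],  # action: sentiment analysis
    speak: Callable[[str, str], None],         # action: speech synthesis with emotion
) -> None:
    while True:                                    # iterative: conversation loop
        text = listen()                            # sequential step 1
        if text.strip().lower() in {"bye", "goodbye"}:  # branching: exit check
            speak("See you next time!", "neutral")
            break
        reply = generate_reply(text)               # sequential step 2
        emotion = classify_sentiment(reply)        # sequential step 3
        speak(reply, emotion)                      # sequential step 4
```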
- *1
- There are many so-called AI chat apps and services, and some allow for character customization and replacement -- but what users really want is to converse with the face and voice of their favorite character. At least for personal use, it is possible to create the appearance and voice of your choice using generative models -- so the remaining problem is to implement a system for conversation.
- *2
- This includes not only the flow configuration but also the location and functionality of the servers called from each node: the server corresponding to a node can run locally or remotely, and its functionality can be provided by a local CPU/GPU process or by an API call (a minimal configuration sketch is given at the end of this section).
- *3
- If code for this environment is generated by a language generation model, the environment itself stays clean because it is isolated in a container; only the code that is not shared needs to be generated, which minimizes the effort.
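To make the idea in *2 concrete, each node in a flow could carry, besides its role, where its server runs (local or remote) and what backs it (a local CPU/GPU process or an external API call). The sketch below is only an assumed shape for such a configuration; the field names and example values are hypothetical.

```python
# Assumed per-node configuration covering the points in *2: the server a node
# calls can be local or remote, and it can be backed by CPU, GPU, or an API.
# Field names and example values are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class NodeConfig:
    name: str        # role of the node in the flow, e.g. "speech-synthesis"
    location: str    # "local" or "remote"
    host: str        # hostname reachable on the container network
    port: int
    backend: str     # "cpu", "gpu", or "api"

    def endpoint(self) -> str:
        # Nodes are called over HTTP regardless of where they run, so moving a
        # server from local to remote only changes this URL, not the flow.
        return f"http://{self.host}:{self.port}"

# The same flow can mix locations and backends freely.
nodes = [
    NodeConfig("language-generation", "local", "llm-server", 8000, "gpu"),
    NodeConfig("speech-synthesis", "remote", "tts.example.com", 443, "api"),
]
```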