Installation: Speech synthesis only
This is a simple flow that makes a character speak the text you enter: [※1]

- Server startup & workflow: cnnmmd_xoxxox_mgrcmf_cmf_txt_vox_001
- Client settings & startup: cnnmmd_xoxxox_appcmf
Setup and launch in two steps:
$ yes | ./manage.sh create cnnmmd_xoxxox_mgrcmf_cmf_txt_vox_001 -d
$ yes | ./manage.sh launch cnnmmd_xoxxox_mgrcmf_cmf_txt_vox_001 -d
- ※1: This flow does not repeat automatically (you must press the button again to run the flow after each text entry).
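Every flow in this guide uses the same create-then-launch pair, so the two steps can be wrapped in a small helper. This is only a convenience sketch (the `setup_flow` name is ours), assuming `manage.sh` takes the subcommand, flow name, and `-d` flag exactly as shown above:

```shell
#!/bin/sh
# setup_flow: create and then launch a named flow non-interactively.
# Usage: setup_flow cnnmmd_xoxxox_mgrcmf_cmf_txt_vox_001
setup_flow() {
  flow="$1"
  yes | ./manage.sh create "$flow" -d || return 1
  yes | ./manage.sh launch "$flow" -d
}
```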
Installation: Speech recognition and speech synthesis
This flow does not require a language generation model. If what you say matches one of the keywords in the configuration file, the character replies with the corresponding phrase (in the listed order or at random):

- Server startup & workflow: cnnmmd_xoxxox_mgrcmf_cmf_sim_wsp_vox_001
- Client settings & startup: cnnmmd_xoxxox_appcmf
Setup and launch in two steps:
$ yes | ./manage.sh create cnnmmd_xoxxox_mgrcmf_cmf_sim_wsp_vox_001 -d
$ yes | ./manage.sh launch cnnmmd_xoxxox_mgrcmf_cmf_sim_wsp_vox_001 -d
A sample configuration file looks like this. The pseudo-sentiment analysis section is a list of keyword patterns (regular expressions) to match:
{ "except": "0", "dicsen": { "1": "happy|fun", "2": "sad|lonely" } }
The pseudo-language generation section is a list of responses keyed by the sentiment analysis result (the numbers "1" and "2"):
{ "output": "random", "dictxt": { "0": [ "you know" ], "1": [ "I'm happy, right?", "It's fun." ], "2": [ "Sad, isn't it?", "It's lonely, isn't it?" ] } }
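The matching and response stages described above can be sketched like this. This is a minimal illustration, not the project's actual code; the function names `analyze` and `respond` are ours:

```python
import json
import random
import re

# Sample configurations copied from the text above.
SEN_CFG = json.loads(
    '{ "except": "0", "dicsen": { "1": "happy|fun", "2": "sad|lonely" } }'
)
TXT_CFG = json.loads(
    '{ "output": "random", "dictxt": { "0": [ "you know" ],'
    ' "1": [ "I\'m happy, right?", "It\'s fun." ],'
    ' "2": [ "Sad, isn\'t it?", "It\'s lonely, isn\'t it?" ] } }'
)

def analyze(text, cfg=SEN_CFG):
    """Return the first sentiment key whose pattern matches, else the fallback key."""
    for key, pattern in cfg["dicsen"].items():
        if re.search(pattern, text):
            return key
    return cfg["except"]

def respond(key, cfg=TXT_CFG):
    """Pick a reply for the sentiment key: at random, or the first listed."""
    choices = cfg["dictxt"][key]
    return random.choice(choices) if cfg["output"] == "random" else choices[0]

print(respond(analyze("that sounds fun")))
```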
Even without a sentiment analysis model or language generation model, you can create various conversation scenes yourself.
Installation: Scripted Workflow
This is a scripted (CLI) workflow. With this configuration, you do not need the GUI workflow-creation environment (ComfyUI): [※1][※2]

- Server startup & workflow: cnnmmd_xoxxox_mgrpyt_web_tlk_wsp_vox_lcp_001
- Client settings & startup: cnnmmd_xoxxox_tlkweb
Setup and launch in two steps:
$ yes | ./manage.sh create cnnmmd_xoxxox_mgrpyt_web_tlk_wsp_vox_lcp_001 -d
$ yes | ./manage.sh launch cnnmmd_xoxxox_mgrpyt_web_tlk_wsp_vox_lcp_001 -d
On the web browser side, follow the client-side steps above to set up and launch the application.
- ※1: The GUI workflow simply calls these script-based relay handlers (dynamic code) from its nodes.
- ※2: Each function is called directly from the programming language (Python), so there is no limit on the degree of freedom (ordering, branching, repetition), and you can even write such functions yourself (a Python class plus a dictionary).
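The "Python class plus a dictionary" pattern mentioned above might look roughly like this sketch. The names `Upper`, `Exclaim`, `HANDLERS`, and `run_pipeline` are illustrative only, not the project's actual API:

```python
# A minimal sketch of dictionary-dispatched handlers: each handler is a class
# exposing a common method, registered in a dictionary under a name, and a
# plain Python script composes them freely (order, branching, repetition).

class Upper:
    def proc(self, text):
        return text.upper()

class Exclaim:
    def proc(self, text):
        return text + "!"

# Dictionary mapping step names to handler instances.
HANDLERS = {"upper": Upper(), "exclaim": Exclaim()}

def run_pipeline(text, steps):
    """Apply the named handlers in order; ordinary Python controls the flow."""
    for name in steps:
        text = HANDLERS[name].proc(text)
    return text

print(run_pipeline("hello", ["upper", "exclaim"]))  # HELLO!
```

Because the composition is ordinary Python, loops and conditionals around `run_pipeline` give you the unrestricted ordering, branching, and repetition that the footnote describes.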
Installation: Speech recognition, speech synthesis, and language generation (natural intonation)
This is a conversation with a speech synthesis model that can more naturally reproduce Japanese intonation: [※2][※F]

- Server startup & workflow: cnnmmd_xoxxox_mgrcmf_cmf_tlk_wsp_vit_lcp_001
- Client settings & startup: cnnmmd_xoxxox_appcmf
Setup and launch in two steps:
$ yes | ./manage.sh create cnnmmd_xoxxox_mgrcmf_cmf_tlk_wsp_vit_lcp_001 -d
$ yes | ./manage.sh launch cnnmmd_xoxxox_mgrcmf_cmf_tlk_wsp_vit_lcp_001 -d
- ※2: Latency is higher than with the faster speech synthesis model (VOICEVOX), but if you can allocate 12-16 GB of memory to the container, the response speed is reasonable.
- ※F: The natural intonation of this model shows the strength of SBV2, but its source data is unknown; if any resemblance to a specific performer is identified, we will replace it with another model.
Installation: Speech recognition, speech synthesis, and language generation (male character)
This is a conversation flow with a male character: [※1]

- Server startup & workflow: cnnmmd_xoxxox_mgrcmf_cmf_tlk_wsp_vox_lcp_002
- Client settings & startup: cnnmmd_xoxxox_appcmf
Setup and launch in two steps:
$ yes | ./manage.sh create cnnmmd_xoxxox_mgrcmf_cmf_tlk_wsp_vox_lcp_002 -d
$ yes | ./manage.sh launch cnnmmd_xoxxox_mgrcmf_cmf_tlk_wsp_vox_lcp_002 -d
Supplement: Unrestricted conversation
If you do not want the conversation content to be restricted, the following model and service may be usable for Japanese language generation: [※1]
- cnnmmd_xoxxox_tttlam [※2]
- cnnmmd_xoxxox_tttnai [※3]
For speech synthesis models, we provide the following samples:
- hownsw_200

- ※1: Neither has a large parameter count, so some tuning will be needed to smooth out awkward responses (the current sample already uses moderate prompts).
- ※2: This model requires a GPU to run.
- ※3: On an ordinary PC (without a GPU), this service may be usable to some extent when combined with an appropriate speech synthesis model.