Kohya_ss Installation Troubleshooting for 50-series Nvidia GPU

Conversation with Gemini

I’m trying to install kohya_ss so I can train a LoRA using a local PC with a local training set, but it cannot find my existing Python install. I have sd.webui and kohya_ss placed under my user folder, and have Python 3.13.3 installed.

This is a very common issue when trying to set up Kohya_ss, especially on a system that already has other Python-based applications like sd.webui (Automatic1111). Here's a breakdown of the problem and the most common solutions:

The Problem: Python Version Conflicts and PATH Issues

Kohya_ss is particular about its Python version. You have Python 3.13.3 installed, but user reports consistently point to a 3.10.x release (e.g., 3.10.9 or 3.10.11) as the requirement, and some setup scripts reject anything at or above 3.11.0.

Additionally, even if you have a compatible version, the installation scripts for Kohya_ss might not be able to find it due to an improperly configured system PATH variable. This is especially true if you have multiple Python installations, as the script might be looking for a specific one or getting confused.
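If you want to confirm which interpreter a given command prompt actually picks up, a minimal Python sketch like the one below can help. The 3.10.x range is taken from the description above and is an assumption, not something read from the Kohya_ss scripts; is_supported is a hypothetical helper name.

```python
import sys

# Assumption: Kohya_ss wants a 3.10.x interpreter (range taken from the text above).
SUPPORTED_MIN = (3, 10, 0)
SUPPORTED_MAX_EXCLUSIVE = (3, 11, 0)


def is_supported(version):
    """Return True if a (major, minor, micro) tuple falls in the 3.10.x range."""
    return SUPPORTED_MIN <= tuple(version[:3]) < SUPPORTED_MAX_EXCLUSIVE


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    verdict = "OK" if is_supported(current) else "outside the 3.10.x range"
    print(f"Python {'.'.join(map(str, current))}: {verdict}")
```

Save it under any name and run it with the same python command the setup script would use; the version it reports is the one that command resolves to.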

Solutions to Try

    Use the Recommended Python Version: The most frequent and successful solution is to use the Python version that Kohya_ss is known to work with. The most commonly cited version is Python 3.10.11.

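        Uninstall your current Python (optional): If nothing else on your system needs Python 3.13.3, uninstalling it before installing the required version is the simplest way to avoid conflicts. If you want to keep it, select 3.10 explicitly with the py launcher or a virtual environment, as described below.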

        Install Python 3.10.11: Download and install Python 3.10.11 from the official Python website. During installation, check the box labeled "Add Python 3.10 to PATH" so that the system can find the right Python executable.

    Use a Virtual Environment (Recommended): The best practice for managing different Python applications (like sd.webui and kohya_ss) is to use separate virtual environments. This isolates the dependencies for each application, preventing conflicts.

        Create a Virtual Environment:

            Open a command prompt or terminal.

            Navigate to your kohya_ss folder.

            Run the command: py -3.10 -m venv venv (This assumes Python 3.10 is installed and registered with the py launcher; it does not need to be on your PATH. A virtual environment is tied to the interpreter that created it, so python -m venv venv would use whichever Python comes first on your PATH. If that is 3.13, delete the venv folder and recreate it with 3.10.)

        Activate the Environment:

            On Windows: venv\Scripts\activate.bat

            On macOS/Linux: source venv/bin/activate

        Run the setup script: Now that your virtual environment is active, run the setup.bat or setup.sh script. It will now install all the necessary packages and dependencies within this isolated environment, using the correct Python version.

    Check and Edit the PATH Environment Variable: If the above steps don't work, you might need to manually check your system's PATH variable.

        Search for "Edit the system environment variables" in your Windows search bar.

        Click on "Environment Variables."

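        In the "System variables" section (or "User variables" if Python was installed for your account only), double-click on "Path."

        Make sure the folders of your Python 3.10 installation (e.g., C:\Python310 and C:\Python310\Scripts) are listed above those of any other Python version. Windows searches the list from top to bottom and uses the first match.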

    Try a Different Setup Script: Some versions of Kohya_ss on GitHub have different setup scripts. For example, some repositories have a setup-3.10.bat specifically for systems with multiple Python versions. Check the repository's README file for any alternative installation instructions.
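To see the order in which Python folders appear on your PATH without clicking through the dialog, here is a small Python sketch. It only looks for entries whose text contains "python", which is a rough heuristic; python_dirs_on_path is a hypothetical helper name.

```python
import os


def python_dirs_on_path(path_value, sep=os.pathsep):
    """Return PATH entries whose text mentions 'python', in search order."""
    entries = [entry.strip() for entry in path_value.split(sep) if entry.strip()]
    return [entry for entry in entries if "python" in entry.lower()]


if __name__ == "__main__":
    found = python_dirs_on_path(os.environ.get("PATH", ""))
    for position, entry in enumerate(found, start=1):
        print(position, entry)
```

The first entry printed is the one Windows tries first, so a 3.13 folder listed above the 3.10 folder would explain a setup script picking up the wrong interpreter.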

Summary of Steps to Get a Clean Install

    Backup: Make a backup of your kohya_ss folder in case something goes wrong.

    Uninstall Python: Go to "Add or remove programs" in Windows and uninstall any Python installations you have, especially 3.13.3.

    Install Python 3.10.11: Download and install Python 3.10.11, making sure to select "Add Python 3.10 to PATH" during the installation.

    Re-clone Kohya_ss: To start fresh, delete your existing kohya_ss folder and re-clone the repository from GitHub.

    Run the Setup Script: Navigate to the new kohya_ss folder in a command prompt and run setup.bat. This should now correctly detect your Python installation and proceed with the rest of the setup.

Show me what appropriate options I should choose for the install after successfully finding the Python 3.10.11 and downloading dependencies.

Excellent! Getting past the Python installation is the biggest hurdle. Once you've successfully run setup.bat and it has detected Python 3.10.11 and is downloading the dependencies, you'll be presented with a series of questions. The options you choose will depend on your specific setup and needs.

Here are the typical options and the most appropriate choices for a local PC with a local training set:

Initial Setup Options (after dependencies download)

    Do you want to install previous versions of torch and associated files...?

        Choose 1 (Yes) if you are upgrading from a very old version of Kohya_ss.

        Choose 2 (No) if this is a fresh install. For a new installation on a clean system with the correct Python, choosing No is the recommended option. It prevents the script from trying to uninstall things that aren't there, which can sometimes cause errors.

    Please choose the version of torch you want to install:

        Choose the version with CUDA support. For a local PC with an NVIDIA GPU, you must choose a CUDA-enabled version of PyTorch. The options are usually something like:

            [1] - v1 (torch 1.12.1)

            [2] - v2 (torch 2.0.0)

            [3] - v2 (torch 2.1.0)

            [4] - v2 (torch 2.2.0)

        Recommendation: Go with the latest stable version that has CUDA support. The most recent versions of Kohya_ss work well with Torch 2.0 and later. If you have any issues, you can always re-run the setup and try an older version, but start with the newest.

    Do you want to install xformers?

        Choose 1 (Yes). xformers is a library that provides memory-efficient attention, which can significantly speed up training and reduce VRAM usage. It's highly recommended for most users.

    Do you want to install bitsandbytes?

        Choose 1 (Yes). bitsandbytes is essential for training with 8-bit or 4-bit quantization. This is a crucial feature that allows you to train larger models or with higher batch sizes, even on GPUs with limited VRAM (e.g., 8GB).

    Do you want to install lion and adamw8bit optimizers?

        Choose 1 (Yes). These optimizers can be more efficient than the standard Adam optimizer and are often used for training LoRAs. They can provide better results or faster training times.

Post-Installation

Once the setup.bat script finishes, it will typically present you with a menu of options for running the GUI.

    Option for the GUI: Look for an option that says something like "Start the Kohya_ss GUI" or "Launch the GUI." This is the main interface you will use for all your training, and it will open in your web browser.

    Accelerate Configuration: After the initial install, you will likely need to run the accelerate config command. This is a one-time setup that tells Kohya_ss how to use your hardware.

        Command: accelerate config

        Prompt-based setup: Follow the on-screen prompts.

            Do you want to use a notebook? -> No

            Do you want to use a distributed training environment? -> No

            Which platform are you using? -> This machine

            What GPU do you want to use? -> All (unless you have a specific reason to choose a single GPU)

            Do you want to use DeepSpeed? -> No (unless you are a very advanced user with a specific use case)

            Do you want to use mixed precision? -> fp16 or bf16. This is a critical choice for performance.

                fp16: Standard half-precision. Good for most NVIDIA cards (RTX 20, 30, and 40 series, etc.).

                bf16: Bfloat16. It has a wider numeric range than fp16 and is less prone to overflow, but it requires an Ampere or newer GPU (RTX 30 series and later). Check if your GPU supports it. If you're unsure, fp16 is a safe and reliable choice.
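The fp16 versus bf16 decision can be reduced to a rough rule of thumb, sketched below in Python. It assumes bf16 becomes usable at CUDA compute capability 8.x (Ampere); supports_bf16 is a hypothetical helper, and the majors in the loop are illustrative values only.

```python
def supports_bf16(compute_capability_major):
    """Rough check: treat bf16 as usable from compute capability 8.x upward.

    Assumption: 8.x corresponds to Ampere (RTX 30 series); RTX 20-series cards
    report 7.5 and are left on fp16.
    """
    return compute_capability_major >= 8


# Illustrative majors: 7 = RTX 20 series, 8 = RTX 30/40 series.
for major in (7, 8):
    mode = "fp16 or bf16" if supports_bf16(major) else "fp16"
    print(f"compute capability {major}.x -> {mode}")
```

Inside the Kohya_ss virtual environment, torch.cuda.get_device_capability() returns the (major, minor) pair for your own card, which you can feed into the helper.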

After completing the accelerate configuration, you're ready to start training. The final step is to run the GUI from the command prompt using .\gui.bat to access the training interface.

I am presented with the following options:

Based on the image you've provided, you are now at the main menu of the Kohya_ss setup script. This menu gives you control over the final installation and launch.

Here is a breakdown of what each option does and the most appropriate steps for a standard installation:

Recommended Steps

    Select 1 to "Install kohya_ss GUI".

        Action: This is the most crucial step. It will install all the necessary Python libraries and dependencies that the Kohya_ss GUI needs to run. It's the core of the installation.

    Select 2 to "Install CuDNN files".

        Action: cuDNN (CUDA Deep Neural Network library) is NVIDIA's library of GPU-accelerated primitives for deep neural networks. Installing the latest supported version can improve training speed. It is a highly recommended step for a standard setup.

    Do NOT select 3 to "Install Triton 2.1.0 for Windows".

        Action: As the menu states, this is a dangerous option. Triton is a compiler for custom GPU kernels; it can be unstable on Windows and is not required for a typical user. Stick to the stable libraries unless you have a specific, advanced reason to use it.

    Do NOT select 4 to "Install specific version of bitsandbytes".

        Action: You should have already installed bitsandbytes during the initial setup process. This option is for power users who need a specific, non-standard version. For a normal install, you can skip this.

    Select 5 to "Manually configure Accelerate".

        Action: This is the critical step for telling Kohya_ss how to use your GPU. You must run this at least once. It will launch the accelerate config prompt-based setup. The appropriate choices for this are:

            This machine

            No for distributed training

            All for which GPU

            fp16 or bf16 for mixed precision (as discussed in the previous answer)

        Note: You can run this step first, then go back and run steps 1 and 2, but it's often more intuitive to install the core components first.

    Select 6 to "Launch Kohya_ss GUI in browser".

        Action: Once you have completed all the necessary installation steps (at least 1, 2, and 5), this option will start the server and open the Kohya_ss interface in your web browser.

Your Action Plan

    Type 1 and press Enter. The script will install the core GUI components.

    Type 2 and press Enter. The script will install the CuDNN files.

    Type 5 and press Enter. This will start the accelerate config process. Follow the on-screen prompts carefully, especially for the mixed precision setting (fp16 is a safe bet).

    After accelerate config finishes, you will return to the main menu.

    Type 6 and press Enter. This will launch the GUI. The command prompt will show the server starting up, and your web browser should open to http://127.0.0.1:7860/ or a similar local address.

If you ever need to restart the GUI, you can run the gui.bat file directly from your kohya_ss folder, or re-run the setup.bat and select option 6.
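If the browser does not open on its own, you can check whether the GUI server is actually listening. The Python sketch below assumes the default address mentioned above (127.0.0.1, port 7860); port_is_open is a hypothetical helper name.

```python
import socket


def port_is_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "listening" if port_is_open() else "not reachable"
    print(f"http://127.0.0.1:7860/ is {state}")
```

If the port is not reachable, look at the command prompt window for the address the server reports, since it may have picked a different port.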

The script appears to get stuck when I choose ‘5’. If it helps, I have a 5060Ti.

You've hit a common but frustrating point. The accelerate config process freezing at a certain point is often related to how the script is trying to query your GPU. It's trying to talk to the NVIDIA drivers and the CUDA libraries, and if there's a problem, it can hang indefinitely.

Your RTX 5060 Ti is a very new card (based on the Blackwell architecture), and there can sometimes be compatibility issues with older versions of software. Here's a troubleshooting guide for what to do when accelerate config gets stuck.

Common Reasons for the Hang

    Driver or CUDA Toolkit Issues: The most common culprit is a mismatch between your NVIDIA drivers, the CUDA version, and the PyTorch build. Even with the latest drivers, the specific version of PyTorch or accelerate that Kohya_ss installs may not support the card. Blackwell GPUs such as the RTX 50 series generally need a PyTorch build compiled against CUDA 12.8 or newer; older builds contain no kernels for the architecture.

    Conflicting Software: Other software that uses the GPU (like games, monitoring tools, or other AI applications) can sometimes lock GPU resources, preventing accelerate from properly querying it.

    Permissions Issues: Less common, but sometimes the script doesn't have the necessary administrative privileges to access the GPU information, especially on Windows.

    Mixed Precision Prompt: The script might be freezing when it gets to the fp16 vs. bf16 question because it's trying to detect your GPU's capabilities. Your RTX 5060 Ti should support both, but the script might be having a hard time getting the information.
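One way to test the PyTorch compatibility theory is to compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The Python sketch below is a rough check under that assumption: arch_supported is a hypothetical helper, and it only looks for an exact sm_XY entry, so a build that ships forward-compatible PTX could still work even when it reports False.

```python
def arch_supported(capability, arch_list):
    """Return True if the build lists an 'sm_XY' entry for this capability."""
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


if __name__ == "__main__":
    try:
        import torch  # only importable inside the Kohya_ss virtual environment
    except ImportError:
        print("PyTorch is not importable here; activate the virtual environment first.")
    else:
        print("PyTorch", torch.__version__, "built for CUDA", torch.version.cuda)
        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("GPU capability:", capability, "build targets:", archs)
            print("Architecture listed:", arch_supported(capability, archs))
        else:
            print("PyTorch does not see a CUDA GPU.")
```

Run it with the Python interpreter inside the venv folder. If the card's architecture is missing from the list, the installed PyTorch build is the likely problem rather than the driver.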

What to Do When It Freezes

    Force-Quit and Re-run in a Clean Environment:

        Close the command prompt window where the script is running.

        Reboot your PC. This is a critical step. It ensures that no other software is using GPU resources and that your drivers are in a clean state.

        Open a new command prompt.

        Navigate back to your kohya_ss folder.

        Try running setup.bat and selecting option 5 again. See if it hangs at the same spot.

    Bypass the setup.bat Menu:

        Sometimes, running the accelerate config command directly from the command line is more stable than running it through the setup.bat menu.

        Open a command prompt and navigate to your kohya_ss folder.

        Type the following command and press Enter:

        accelerate config

        This will run the same configuration wizard but in a standalone way. This can sometimes resolve the hang.

    Check Your Drivers and CUDA Version:

        Confirm you have the latest NVIDIA drivers installed. Go to the NVIDIA website and download the latest GeForce Game Ready or Studio Driver.

        You might also need to ensure your CUDA toolkit is up-to-date. The PyTorch packages that Kohya_ss installs bundle the CUDA runtime they need, so a system-wide toolkit is normally not required, but an older system-wide CUDA installation could still cause conflicts.

    Choose a Different Mixed Precision Option:

        If you can get to the mixed precision question, try selecting fp16 first. While your 5060 Ti supports bf16, fp16 is a more widely supported and established standard. If fp16 works, you can always try bf16 later.

    Reinstall accelerate and PyTorch (Last Resort):

        If nothing else works, you might have a corrupted installation. You can try to force a reinstallation of the necessary packages.

        In your command prompt, navigate to the kohya_ss folder.

        You can try to use pip to uninstall and reinstall the packages, but a cleaner way is to delete the virtual environment folder that the script created.

        Look inside your kohya_ss folder for a directory named venv or something similar (it may be a hidden folder). If you find it, you can delete it. Then, re-run setup.bat from the beginning. This will create a fresh, clean installation.
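To record which driver version is installed before and after these steps, you can query nvidia-smi from Python. This is a minimal sketch that assumes nvidia-smi is on your PATH (it is installed with the NVIDIA driver); parse_gpu_line and query_gpus are hypothetical helper names.

```python
import shutil
import subprocess


def parse_gpu_line(line):
    """Split one 'name, driver_version' CSV line from nvidia-smi into a dict."""
    name, driver = [field.strip() for field in line.split(",", 1)]
    return {"name": name, "driver_version": driver}


def query_gpus():
    """Return a list of GPU dicts, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    print(query_gpus() or "nvidia-smi not found or returned an error")
```

An empty result on a machine with an NVIDIA card usually points at the driver installation itself, which is worth fixing before touching the Python side.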

Given your new GPU, the most likely issue is a subtle incompatibility. Directly running accelerate config is often the best way to get past this without the setup.bat wrapper causing issues.

I rebooted my system and tried running ‘accelerate config’ from the kohya_ss folder but it could not find the command specified. There is a script which specifies Python 3.10.11 and I’m wondering if replacing the text to specify the latest version may help.

This is a crucial detail. accelerate is not a command your system recognizes globally. It is a command-line entry point that lives inside the virtual environment Kohya_ss created. When you ran setup.bat, it created an isolated Python environment (often in a folder named venv) and installed all the necessary libraries, including accelerate, within that environment.

When you run accelerate config in a normal command prompt, Windows searches the folders listed in your PATH variable for an executable named accelerate and can't find one.

The Correct Way to Run accelerate config

You must first activate the virtual environment before you can run any of the scripts or commands that live inside it.

Here's the correct process:

    Open a command prompt.

    Navigate to your kohya_ss folder.

        Example: cd C:\Users\YourUser\kohya_ss

    Activate the virtual environment. The command for this on Windows is:

        .\venv\Scripts\activate

        You will know it worked because your command prompt's line will change to show the virtual environment name at the beginning, like this: (venv) C:\Users\YourUser\kohya_ss>

    Now, run the command. With the virtual environment active, your command prompt knows where to look for accelerate.

        accelerate config

This should successfully launch the configuration wizard without the "command not found" error.
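If you're ever unsure whether the environment is active, the following Python sketch shows one way to check. It relies on the standard behavior that sys.prefix and sys.base_prefix differ inside a venv; in_virtualenv is a hypothetical helper name.

```python
import shutil
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix points at the Python install that created it.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("accelerate resolves to:", shutil.which("accelerate"))
```

With the environment active, the second line should show a path inside venv\Scripts; None means the command prompt still cannot find accelerate.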

Regarding the Python Version Script

You mentioned a script that specifies Python 3.10.11 and wondered about changing it to a newer version.

This is a bad idea and will likely cause more problems.

Kohya_ss pins its dependencies (such as PyTorch and accelerate) to versions that were built and tested against a narrow range of Python versions, primarily Python 3.10. The hang you experienced with accelerate is not due to using an outdated Python version. It's an issue with the GPU and driver communication.

If you were to change the script to use a newer Python version (like 3.12 or 3.13), the following would happen:

    Dependency Failures: The pinned versions of PyTorch and other key libraries would likely fail to install or run correctly because they have no pre-compiled binaries for that Python version.

    Version Mismatches: Even if you could get the dependencies to install, you'd be introducing version mismatches between the training scripts and the libraries, which could lead to unpredictable errors and crashes.

    Wasted Time: You would spend a lot of time trying to fix a problem that is not the core issue.

Stick to Python 3.10.11. It is the recommended and stable version for this software. The problem is not with the Python version, but with how the software is configured to use your specific hardware.

Final steps to resolve your issue:

    Make sure you have the latest NVIDIA drivers for your 5060 Ti.

    Open a fresh command prompt.

    Navigate to your kohya_ss folder.

    Run .\venv\Scripts\activate.

    Then, run accelerate config.

This should resolve the "command not found" error and allow you to proceed with the configuration. If it still hangs at a certain point, let me know, and we can explore more advanced troubleshooting steps, such as manually configuring the accelerate file.

I was able to get to the configuration wizard using the steps described. Show which options to choose that are appropriate for the system:

[Attached: two uploaded images, both named image.png]

Excellent! You have successfully activated the virtual environment and launched the accelerate config wizard. This is exactly where you need to be.

Based on your system and goal (local PC with a single NVIDIA GPU for LoRA training), here are the correct options to choose from this point forward.

accelerate config Choices

    "In which compute environment are you running?"

        The correct choice is already selected: This machine.

        Press Enter to continue.

    "Which type of machine are you using?"

        Choose: No distributed training

        Reason: You are training on a single GPU in a single PC, not across multiple GPUs or machines. This is the standard choice for almost all home users. Using the arrow keys, navigate to the first option and press Enter.

    "Do you want to run your training on CPU only?"

        Choose: No

        Reason: You have an NVIDIA GPU (RTX 5060 Ti) and the entire purpose of this setup is to use it for training.

    The GPU selection prompt, which asks which GPU IDs to use for training on this machine

        Choose: All

        Reason: This tells accelerate to use all available GPUs. Since you have only one, it will default to that. If you had multiple GPUs, this would utilize all of them for training.

    "Do you want to use DeepSpeed?"

        Choose: No

        Reason: DeepSpeed is a complex library for large-scale distributed training on massive models. It is not necessary for LoRA training on a single consumer GPU and can introduce unnecessary complexity.

    "What mixed precision should be used for training?"

        This is the most important option for performance.

        Choose: fp16

        Reason: This is a safe and highly effective choice for a new GPU like the RTX 5060 Ti. It enables half-precision training, which significantly speeds up the process and reduces VRAM usage.

        Note: While your card technically supports bf16, fp16 is more widely compatible and stable with many models. bf16 can sometimes lead to issues with older models or scripts. Start with fp16; it will provide excellent performance.

Summary of Selections

    Compute environment: This machine

    Distributed training: No distributed training

    CPU only: No

    Which GPU: All

    DeepSpeed: No

    Mixed precision: fp16

After you make the last choice, the script will finish the configuration and save the settings. You will be returned to the command prompt.

From there, you are ready to launch the Kohya_ss GUI by running:

.\gui.bat

Or by re-running setup.bat and selecting option 6. Your system is now properly configured and ready for training.
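To double-check what the wizard saved, you can print the configuration file. The Python sketch below assumes accelerate's default location under the Hugging Face cache folder (overridable through the HF_HOME or XDG_CACHE_HOME environment variables); default_accelerate_config is a hypothetical helper, and if the wizard printed a different path when it finished, use that one instead.

```python
import os
from pathlib import Path


def default_accelerate_config(env=None):
    """Return the path where accelerate is assumed to store its default config."""
    env = os.environ if env is None else env
    cache_root = env.get("XDG_CACHE_HOME", os.path.join("~", ".cache"))
    hf_home = env.get("HF_HOME", os.path.join(cache_root, "huggingface"))
    return Path(os.path.expanduser(hf_home)) / "accelerate" / "default_config.yaml"


if __name__ == "__main__":
    config_path = default_accelerate_config()
    if config_path.exists():
        print(config_path.read_text())
    else:
        print("No config found at", config_path)
```

With the selections above, the file would be expected to contain entries such as compute_environment: LOCAL_MACHINE, distributed_type: 'NO', and mixed_precision: fp16. Running accelerate env inside the activated environment should print the same information.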

How about, “Do you wish to optimize your script with torch dynamo? [yes/NO]”

That's an excellent question, and it's a newer feature in the accelerate configuration that many guides don't cover.

My recommendation is to choose no for now.

Reasoning

Torch Dynamo (TorchDynamo) is the part of PyTorch 2.x behind torch.compile. It captures the model's Python code as graphs, which a compiler backend then turns into more efficient, hardware-specific kernels. When it works, it can provide a significant performance boost.

However, it is still a relatively new and experimental feature in the context of many specific training scripts, especially for things like Stable Diffusion and LoRA training. It can sometimes introduce:

    Instability: It may cause unexpected crashes or errors during training, particularly with custom scripts or specific features (like certain optimizers or schedulers) that aren't yet fully compatible with Dynamo's compilation process.

    Compilation Overhead: The initial compilation can take a while, and if it fails or has to recompile for different parts of the script, it can actually slow things down.

    Difficulty in Troubleshooting: If your training fails with Dynamo enabled, it can be very difficult to figure out the root cause because the error may be related to the compiled code, not your original script.

When to Consider "Yes"

You might consider choosing yes only after you have a stable, working training pipeline without it. If your training is running smoothly and you want to experiment with a modest speedup (a few more iterations per second), you can try re-running accelerate config and enabling it.

Your Action Plan

For a new user setting up their environment for the first time, it's always best to start with the most stable and proven configuration.

    Choose NO at the "Do you wish to optimize your script with torch dynamo?" prompt.

    Complete the rest of the configuration.

    Proceed with your training.

This will ensure you have the most reliable setup to get started. You can then experiment with more advanced options like Torch Dynamo once you're familiar with the training process and know that your base setup is functional.

How about, “Would you like to enable numa efficiency? (Currently only supported on NVIDIA hardware). [yes/NO]”

That's another excellent and specific question. It shows that you're paying close attention to the details of the configuration, which is the key to a successful setup.

My recommendation is to choose NO for this option.

Reasoning

NUMA (Non-Uniform Memory Access) is a system architecture feature found on some multi-socket servers. In a NUMA system, a CPU can access its own local memory much faster than it can access the memory of another CPU on a different socket.

The "numa efficiency" setting in accelerate is designed to optimize training on these specific multi-CPU, multi-GPU server environments. It ensures that the GPUs on one socket are paired with the CPU on the same socket to minimize data transfer latency.

For a typical consumer PC, even one with a powerful CPU and GPU:

    Consumer CPUs are typically a single socket. You do not have multiple CPUs on separate sockets.

    The NUMA architecture is not relevant to your system. Enabling this option on a single-socket consumer PC is unlikely to provide any benefit and, in some cases, could introduce instability or errors because the system architecture doesn't match the optimization it's trying to apply.

Your Action Plan

    Choose NO at the "Would you like to enable numa efficiency?" prompt.

    Continue with the rest of the configuration.

For your specific use case (training on a single PC with a single GPU), this feature is not applicable and should be left disabled. This will ensure the most stable and reliable training experience.

Afterwards, can I close the wizard, run setup.bat and launch the webui?

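Yes, that's exactly the correct procedure. Once you've successfully completed the accelerate config wizard, all of the necessary settings have been saved to a configuration file. By default this is default_config.yaml under the Hugging Face cache folder in your user profile rather than in your kohya_ss folder, and the wizard prints the exact path when it finishes.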

Here are the final steps, assuming you've completed all the prompts in the wizard:

    Close the accelerate config wizard.

        Once the final question has been answered, the wizard should exit on its own, and you'll be returned to the command prompt line.

    Return to the setup.bat menu (Optional but recommended).

        The most straightforward way to launch the GUI is to re-run setup.bat from your kohya_ss folder. This ensures that any final checks or preparations are completed before launch.

    Choose option 6 to "Launch Kohya_ss GUI in browser".

        This will start the server. You'll see a bunch of text scrolling in the command prompt window as it loads its libraries.

        Once it's finished loading, it will open your default web browser to the Kohya_ss interface, typically at an address like http://127.0.0.1:7860/.

Alternative Method (for future use):

You don't always have to go through setup.bat. For future sessions, once everything is installed and configured, you can simply:

    Open a command prompt.

    Navigate to your kohya_ss folder.

    Type .\gui.bat and press Enter.

This will launch the GUI directly without going through the setup menu.

So, to answer your question directly: yes, you are now ready to launch the webUI. Your diligent work in configuring the system properly has paid off, and you can now proceed to the exciting part: training your LoRA.

