While installing the different components you may have to close and reopen your PowerShell or Command Prompt window. If commands keep failing, restart your PC to be safe.
Open the Start menu
Search for “Edit the system environment variables”
Click “Environment Variables” and make sure these entries are there. If they are missing, add them to the User variables (double-click Path, or select it and click Edit), NOT the System variables
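To confirm that a newly opened terminal actually sees the PATH changes, a small check like the one below can help. It is only a sketch: the tool names in the list are assumptions (ffmpeg in particular is not mentioned above), so swap in whatever you added to PATH.

```python
import shutil


def find_on_path(names):
    """Return {name: full path or None} for each executable name."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Hypothetical list of tools: adjust it to the entries you added to PATH.
    for name, location in find_on_path(["python", "pip", "ffmpeg"]).items():
        print(f"{name}: {location or 'NOT FOUND on PATH'}")
```

Run it from a terminal opened after saving the environment variables, since windows that were already open keep the old PATH.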
Open a Command Prompt window and run the following commands
WhisperX >> used to transcribe and align
pip install git+https://github.com/m-bain/whisperx.git
ffsubsync >> syncs the subtitle file with either the video or another .srt file
pip install ffsubsync
Pytube >> downloads YouTube video and audio
pip install pytube
Torch, torchvision, torchaudio (check the latest supported CUDA version first; the cu121 at the end of the URL stands for CUDA 12.1)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
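After the installs finish, it can be useful to confirm that pip registered every package. The sketch below uses only the Python standard library; the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if pip does not know the package."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names taken from the pip commands above.
for package in ["whisperx", "ffsubsync", "pytube", "torch", "torchvision", "torchaudio"]:
    print(f"{package}: {installed_version(package) or 'not installed'}")
```

If a package shows as not installed, rerun its pip command in a fresh terminal window.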
Check if GPU processing is available. Save the following three lines in a file called torch_version.py
import torch
print(torch.__version__)
print(torch.cuda.is_available())
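Run the script from the folder where you saved it
python .\torch_version.py
Example output (your version number may differ; True means the GPU can be used)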
2.3.1+cu121
True
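If the check prints False, a slightly more detailed report can show whether a CPU-only build of torch was installed by accident. This is a minimal sketch; the function name gpu_report and the dictionary keys are my own choices, not part of torch.

```python
def gpu_report():
    """Collect basic facts about the torch install and the GPU, if any."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version torch was built for; None means a CPU-only build.
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
    return report


for key, value in gpu_report().items():
    print(f"{key}: {value}")
```

A built_for_cuda value of None usually means the plain CPU wheel was installed, in which case reinstalling with the --index-url command above is the likely fix.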
There are several ways to use whisperX to create subtitle files for whatever you are watching or listening to. I list the different possibilities below; you can follow one of them or combine them.
The command accepts more options than the ones shown here. If you are interested, run whisperx --help or look them up online.
Put the file you want whisperX to transcribe into a folder of your choice. Open a terminal in that folder (SHIFT + Right Click in the folder and pick the PowerShell option, or press ALT + D in File Explorer, type cmd, and press Enter) and paste the following command.
whisperx "filename" --model large-v2 --language Korean --batch_size 1 --compute_type float32 --device cuda
This creates an .srt subtitle file, plus some other files, based on the specified file. The subtitles may not match the timing of the audio; there are a few ways to fix that, which I describe in other sections below.
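If you have several files in one folder, the same command can be repeated from a small Python script instead of pasting it by hand. This is only a sketch: the flags are copied from the command above, while the function names, the list of file extensions, and the dry_run switch are assumptions made for this example.

```python
import subprocess
from pathlib import Path

# Assumption: adjust this set to the file types you actually want to transcribe.
MEDIA_EXTENSIONS = {".mp4", ".mkv", ".mp3", ".wav"}


def build_whisperx_command(media_file, language="Korean"):
    """Build the same whisperx call as above for a single file."""
    return [
        "whisperx", str(media_file),
        "--model", "large-v2",
        "--language", language,
        "--batch_size", "1",
        "--compute_type", "float32",
        "--device", "cuda",
    ]


def transcribe_folder(folder, dry_run=True):
    """Print (and optionally run) one whisperx command per media file in folder."""
    commands = [
        build_whisperx_command(path)
        for path in sorted(Path(folder).iterdir())
        if path.suffix.lower() in MEDIA_EXTENSIONS
    ]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the loop if whisperx reports an error.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Only prints the commands; set dry_run=False to really transcribe.
    transcribe_folder(".", dry_run=True)
```

Starting with dry_run=True lets you check which files would be processed before any GPU time is spent.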