The application itself is pretty easy to operate. Basic functions and keybindings are explained in Help -> Controls. However, there are a couple of more “complex” features that need clarification. See also the video tutorials at https://www.youtube.com/channel/UCvZjzCkyqO3AP2hs45lct3A

Cheat sheets of keyboard / mouse shortcuts for printing

Getting started

Create new projects in Lyric Video Studio. Import lyrics or create a blank project. Or create templates for your favorite set of services

Start by creating a new blank project from File -> New project -> Create blank. The app will ask you to save the project file; this is mandatory.

The track menu is opened with a right click. Each track menu is context aware, allowing you, for example, to import items from other tracks.

You can drag and drop image, video, audio and text files (including .LRC) from your system onto the timeline. Items are positioned where you drop them. If you drop an item onto a track of a different type, for example an image onto a video track, a new track is created.

You can also import video or audio by inserting new items into tracks. The menu for creating a new track is opened with a left mouse click. The same menu contains other track-related options.

Adding tracks is easy. You can add a track from the context menu or by dropping content onto the timeline

Depending on which track menu you opened, the new track is created before or after the clicked track. You can re-order the tracks at any time. Image plugins / Video plugins are the 3rd party plugins that offer more ways to create images and videos for your project.

Horizontally scrolling DAW-like timeline for media items

After you have created a track (or if you want to add text items to a pre-created lyrics track), hover over any empty space in the track with your mouse, hold CTRL (control) and left click the empty spot to create a new item. A menu will pop up on the right side of the screen, allowing you to choose the source for the item. The lyric track (and usually 3rd party plugin tracks) supports importing items to the track. That feature is accessed from the same menu where new tracks are created.

Selecting and copying items

Selecting items from timeline is fast and easy

First, select items by holding down S and hovering over the items you want to select. Selected items become slightly brighter and you can see the number of selected items. Only one type of item can be selected at once: for example, three lyric items, but not three lyric items and one image item. After you have selected items, you can either manipulate them just like single items or paste them by pressing ctrl + left mouse button on an empty spot on the track. Context menu (right click) operations, like delete, affect all selected items. To clear the selection, press X.

Importing lyrics

Creating a new project by importing lyrics is the most common use-case for the app.

  1. Start by creating a new .txt file and copy your lyrics there.
  2. Lyrics are imported line by line, so each line in the txt file represents one lyric item in the timeline. As you already know the song, you probably have some idea of how the lyrics should be displayed. Edit the text file and break up long lines, and split lines where there is a long pause between words. You can do this later after importing, but the more effort and thought you put in at this stage, the less there is to do in the app
  3. Select File -> New project -> Import lyrics and select the file you created and saved
  4. The project and lyric items are now created and you can save the project to your desired location
  5. You can also import later, by right clicking the lyrics track menu and selecting “Import lyrics”. Remember that this will clear all existing items on that track
  6. See example file:

If you chose not to import lyrics as a new project, there are two other ways to do it after the project has been created

  1. Drag & drop a .txt or .lrc file to the timeline
  2. Drag & drop an audio file to the timeline, right click the item and select “Extract text”. This opens the audio-to-text extraction view, which uses Whisper models. Note that extraction speed depends on your GPU. Also, not all languages are supported. And just like a human, AI can mishear the lyrics (ref: https://www.youtube.com/watch?v=gg5_mlQOsUQ) 😀 The Extract text button is also found in the audio item settings
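
For reference, .lrc files usually carry a timestamp in front of each lyric line, in the form [mm:ss.xx]text. The Python sketch below shows how such lines map to millisecond positions on a timeline. It only illustrates the file format; the function name parse_lrc is hypothetical and this is not a description of how Lyric Video Studio reads the file internally.

```python
import re

# Matches one LRC time tag such as [01:23.45] or [01:23.456] (fraction optional)
TIME_TAG = re.compile(r"\[(\d+):(\d{1,2})(?:[.:](\d{1,3}))?\]")


def parse_lrc(text):
    """Return a sorted list of (start_ms, lyric) tuples from LRC text."""
    items = []
    for raw in text.splitlines():
        tags = list(TIME_TAG.finditer(raw))
        if not tags:
            continue  # metadata such as [ar:Artist], or a blank line
        # The lyric is whatever follows the last time tag on the line
        lyric = raw[tags[-1].end():].strip()
        for tag in tags:
            minutes, seconds, frac = tag.groups()
            # Right-pad the fraction so "5" means 500 ms and "45" means 450 ms
            frac_ms = int((frac or "0").ljust(3, "0"))
            start_ms = int(minutes) * 60_000 + int(seconds) * 1000 + frac_ms
            items.append((start_ms, lyric))
    return sorted(items)


sample = "[ar:Someone]\n[00:12.50]First line\n[01:02.03]Second line\n"
for start_ms, lyric in parse_lrc(sample):
    print(start_ms, lyric)
```

A plain .txt file has no timestamps, which is why items imported from it start out unsynced.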

Audio-to-text

Text extraction also supports the OpenAI (basically Whisper-1) and ElevenLabs APIs. You need to get your own API keys for those: https://platform.openai.com/api-keys and https://elevenlabs.io/app/settings/api-keys NOTE: You can use the same ElevenLabs key for the plugin as well, but for security and isolation reasons, the keys need to be pasted separately. ElevenLabs also supports defining limits for keys, so you might want to create a separate key for audio-to-text. ElevenLabs also seems to provide a generous free tier, 10k credits/month https://elevenlabs.io/pricing/api

Extract text from audio or video. No need for manual syncing. No manual downloading of models etc.; the application makes it easy for you

Working with texts

After you import lyrics, all items are shown with a grey background. This means that the lyric item is not yet “synced”. There are currently two ways of syncing lyrics (more ways may come up later)

Move items manually to their places

  1. Import an audio or video file into your newly created project; this will make the timeline expand to the desired length
  2. Starting from the last item, move each item (or a selection of items; hold shift and hover over items to select) by holding the left mouse button and dragging. You can scroll the timeline horizontally by pressing ctrl and using the mouse wheel
  3. This method is a bit tedious and not recommended. See the next chapter for a much more fun and faster method!

Sync with “rec”

The most fun way of syncing lyrics is the rec & sync button. After you have imported the lyrics, any grey items can be synced that way:

  1. Import audio (or video) to the newly created project
  2. ‘Arm’ the lyrics track with the little ‘R’ button on the left side
  3. Press rec (o)
  4. You can see the lyric items starting to scroll in the timeline, while the first unsynced item follows the playhead.
  5. When you hear that particular line in the song, press the “Sync” button at the top of the timeline view and keep it pressed for as long as you want that line to be visible
  6. When you release the “Sync” button, the next item becomes the “syncing target”. Keep repeating #5 until all the lyrics have been synced
  7. You can stop syncing at any time and continue from the last unsynced item
  8. Don’t worry if you missed a beat or the line was not long enough. You can manually fine tune the results in the timeline view or even reset the items to their previous state from the lyrics track menu
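
The press-and-hold idea in steps 5 and 6 can be pictured as pairing each press/release of the “Sync” button with the next unsynced line. The Python sketch below is only an illustration of that pairing; the function name, the event tuples and the dictionary keys are all hypothetical and do not describe the app's internals.

```python
def apply_sync_events(lyrics, events):
    """Pair (press_ms, release_ms) events with lyric lines, in order.

    lyrics: list of lyric strings, all still unsynced.
    events: list of (press_ms, release_ms) tuples captured while recording.
    Returns (synced, remaining): synced is a list of dicts with start and
    end times, remaining is the list of lines that are still unsynced.
    """
    synced = []
    for text, (press_ms, release_ms) in zip(lyrics, events):
        if release_ms <= press_ms:
            raise ValueError("release must come after press")
        # Press sets where the line appears, release sets where it disappears
        synced.append({"text": text, "start_ms": press_ms, "end_ms": release_ms})
    return synced, lyrics[len(synced):]


synced, remaining = apply_sync_events(
    ["First line", "Second line", "Third line"],
    [(1000, 2500), (3000, 4200)],
)
print(synced)
print(remaining)
```

Stopping early simply leaves the rest of the lines in the remaining list, which mirrors step 7: syncing can continue later from the last unsynced item.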

1. Rec, 2. Sync

Sync lyrics manually in a game-like fashion. A fast and easy way to sync your song: just play along and press space to sync the next item
Arm track for lyric syncing


Defining font

You can define the font for the texts on many levels. If you want the texts to look the same on all tracks, just define the font on project level, in project settings (shortcut P). This will apply the same font definitions to all texts

If you need a different font for all texts on a single track, you can override the project font definition from track settings (+ button on the top left of the track). You can either copy the project definition or start from scratch.

You can also define the font for individual text items. Just like for the track, you can add a font override for text items from item settings.

Fonts have a wide variety of settings available, including using a shader or gradients as the font color.

Font background

You can define the background for the font in the font settings. In practice, you define a primitive for the item to be used as the background. For a font background, the Width / Height of the primitive act as padding. You can even set a shader for the background, creating all kinds of cool effects. The picture below has shaders on both font color and background

Karaoke mode

Here are quick instructions for achieving karaoke-style texts:

  1. Create and sync the lyrics
  2. Set item alignment to Left (top, bottom or middle). This is an important step to avoid manual labour in the next steps
    • Pro tip: You don’t need to set each item alignment manually, you can use multi edit for that. Either select all items in the track with right click -> Select -> All or press the + icon on the track and click “Edit track items”
  3. Adjust font settings, text positions and everything else now, because the next steps need to be redone if you change anything later
  4. Duplicate (below) the lyrics track that is synced and ready
  5. Open track effects from the created track and click “Copy project font definition”. Note that this is not needed if the original track already had a track font override active
  6. Now, select one lyric item and tap “Word by word offsetting”, or from multi edit, select “Apply word by word offsetting”
  7. Fine tune the start / end times if needed. Times are relative to the total length of the item

3rd Party plugins install

Currently all you have to do is drop the dll file into the plugin folder of the app and restart the software. You can also open the plugin folder from Settings -> Plugins. Add new plugins only from sources you trust, and read the instructions carefully so that you know which services the plugin uses and how much the usage costs per generated content. In settings, you can also define a folder to scan for new plugins. After adding the folder or adding new plugins, restart the app.

Using crossfades

  1. Start by dragging the image or video items in the timeline so that they slightly overlap
  2. Click the + icon of the track and choose “Track effects”
  3. Select the crossfade from the right side menu
  4. The crossfade affects all items in the track, and the crossfade time is determined by how much the items overlap
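
As a rough illustration of step 4, the Python sketch below derives the crossfade duration from the overlap of two neighbouring items and blends their opacities over that window. The linear blend and the function name are assumptions made for this example; the curve the app actually uses is not stated in this guide.

```python
def crossfade_weights(prev_end_ms, next_start_ms, t_ms):
    """Return (prev_opacity, next_opacity) at time t_ms for two neighbouring items."""
    overlap = prev_end_ms - next_start_ms  # crossfade time = length of the overlap
    if overlap <= 0:
        # No overlap: nothing to fade, hard cut at the start of the next item
        return (1.0, 0.0) if t_ms < next_start_ms else (0.0, 1.0)
    progress = (t_ms - next_start_ms) / overlap
    progress = min(1.0, max(0.0, progress))  # clamp outside the overlap window
    return (1.0 - progress, progress)


# Previous item ends at 5000 ms, next item starts at 4000 ms: 1000 ms crossfade
print(crossfade_weights(5000, 4000, 4500))
```

Dragging the items further over each other lengthens the overlap and therefore the fade.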

Common use cases and examples

Adding lyrics to existing video

Adding lyrics or subtitles to existing video is the most basic use case of the app. See example project here:

Unzip the file to your computer and start Lyric Video Studio. Open the .lvsp file and wait for the app to cache the media content

Creating still image ‘idea sketch’

You can also create a video using existing audio and generate images on the timeline to sketch out ideas for the video or to pitch the video to others. You might even use it to clarify your ideas to whoever is making the initial video for you.

You can now view still images as a storyboard. Click on the bottom right of the render screen to access storyboard mode. Select the desired column & row count. Select the track you wish to see on the grid and press Apply. When you move the playhead or play back the song, the view highlights the image that would currently be shown. The page changes automatically after the last image. If you make changes to items in the selected track, press Apply to load the changes (this will improve in the near future)

Storyboard mode is accessed with the Render switch
Storyboard mode allows you to plan your video ahead using images. This saves money and gives you more control
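
The paging behaviour can be pictured with simple arithmetic: with a given column and row count, the index of the currently shown image determines the page and the highlighted cell. The Python sketch below assumes the grid is filled row by row, which is a guess for illustration, and the function name is hypothetical.

```python
def storyboard_cell(image_index, columns, rows):
    """Return zero-based (page, row, column) for the image at image_index."""
    if columns < 1 or rows < 1:
        raise ValueError("columns and rows must be at least 1")
    per_page = columns * rows
    page, offset = divmod(image_index, per_page)  # which page, position on it
    row, column = divmod(offset, columns)         # assumed row-by-row fill order
    return page, row, column


# 4 columns x 3 rows: the 13th image (index 12) is the first cell of page 2
print(storyboard_cell(12, 4, 3))
```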

Create music video with text-to-video or image-to-video

Somewhat similar to the ‘still image sketch’, you can use still images to generate videos with the Luma Labs Dream Machine or Runway ML plugins. The results are usually better that way than with text-to-video alone. You can also do text-to-video: just import the audio and start creating (example project coming soon). New integrations with image & video generation services will be added as they become available!

Cut and edit your video footage

Lyric Video Studio is also a fully capable video editor for editing and combining your existing footage into regular videos. You can add captions / subtitles and export the video at any frame rate or size

Opt-in DropBox support

Some plugins, like Runway ML & DreamMachine, require images used by the service to be uploaded to a publicly accessible place. To help with that, you can opt in to DropBox integration in View -> Settings -> Plugins -> Plugin Content Delivery. When you create a video with the plugin, just set the local file as the image source (hint: right click on the image, then ‘Copy source path’). The file will be uploaded to your DropBox account and a temporary (4 hours) link will be created for the video service to grab the image. Images are not deleted automatically, so remember to clean up old images from the app/LyricVideoStudio folder in your DropBox

Fragment shaders

Lyric Video Studio also supports fragment shaders, written in the SkiaSharp shading language, aka SKSL. Some shaders are included, but you can create your own shaders as well. When copying shaders from external sources, be sure to honor the original license.

Shaders can be used with any item that has a color definition, as well as with images and video. It is important to know that if you use a shader that outputs transparent pixels (like the ready-made audio spectrum shader), you need to turn on the item blend mode. Then try out which mode works best for your shader.

A couple of good-to-know things for adding new shaders:

  1. If you want to blend the shader with an image, select Add parameter -> Add image. Usually the iImage parameter controls how much of the texture color is added to the shader. The texture is taken from the image or video you apply the shader to. The editor adds the following parameters:
    • iImageDimensions tells the pixel size of the input image. In SKSL, texture pixels are evaluated with exact coordinates.
    • iImage is the texture
    • iImageScale holds a float value that you can apply to the iImage parameter to adjust how much of the texture pixel color is added
    • iImageDimensionRatio tells the ratio between the width and height of the source image and is needed for getting the right pixel from the screen. Example of getting the pixel for relative coordinates (uv = two-dimensional vector, x & y, 0-1): float2 uvPix = uv * iImageDimensions * iImageDimensionRatio;
  2. If you want the shader to react to music, like the spectrum analyzer, select Add parameter -> music. That adds two parameters automatically:
    • iMusic receives a texture generated from the most recent audio played when the shader is rendered. The texture is 512×1 with frequency powers as colors. To get the power of a certain frequency, sum R, G & B and divide by three to get a percentage value:
      float sampleR = iMusic.eval(float2(uv.x * 512, 0)).r;
      float sampleG = iMusic.eval(float2(uv.x * 512, 0)).g;
      float sampleB = iMusic.eval(float2(uv.x * 512, 0)).b;
      float sample = (sampleR + sampleG + sampleB) * .33;
    • iMusicScale float can be used to boost this value if needed
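
To make the iMusic arithmetic easier to follow outside a shader, here is a small Python sketch of the same idea: pick the texel under a horizontal position and average its three channels. The list of tuples stands in for the 512×1 texture, the function name is hypothetical, and real texture sampling (filtering, edge handling) is not modelled. The shader snippet in this guide multiplies by .33, which is a close approximation of the division by three used here.

```python
MUSIC_TEXTURE_WIDTH = 512  # the guide describes iMusic as a 512x1 texture


def frequency_power(music_row, u, scale=1.0):
    """Average R, G and B of the texel under horizontal position u (0..1).

    music_row: list of 512 (r, g, b) tuples with channel values in 0..1,
    standing in for the iMusic texture. scale plays the role of iMusicScale.
    """
    # Same idea as uv.x * 512 in the shader, clamped so u == 1.0 stays in range
    x = min(int(u * MUSIC_TEXTURE_WIDTH), MUSIC_TEXTURE_WIDTH - 1)
    r, g, b = music_row[x]
    return (r + g + b) / 3.0 * scale


row = [(0.0, 0.0, 0.0)] * MUSIC_TEXTURE_WIDTH
row[256] = (0.3, 0.6, 0.9)
print(frequency_power(row, 0.5))
```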

Removing backgrounds

There are three options in Lyric Video Studio for removing backgrounds from videos and one for images.

Using rembg

This option is the most advanced, using AI to remove the background. It does not require a high end GPU, but will be a lot faster with one. To use rembg, you must install additional software and libraries

  1. Go to Settings, expand “Remove background”
  2. Click install; this will install embedded Python and rembg. This takes a couple of minutes, maybe more on slower computers and network connections
  3. The installation will also prompt for Visual C++ redistributable installation. If you already have it on your machine, you can click close.
  4. The installation includes embedded Python (so it won’t mess up your own Python installation) and RemBg with the required dependencies. Everything is installed to documents/LyricVideoStudio
  5. Now the background removal should work when you right click a video or image item and select “Remove background”
  6. From settings, you can change the model used for background removal; see the model descriptions here: https://github.com/danielgatis/rembg. When using any model for the first time, RemBg will download the model first, so the first attempt takes some time.
Background removal is integrated and works with images and videos

Remove color from video

This type of background removal is more classic, also known as green screen. Click “Replace color with transparency” in the context menu and a UI with a couple of simple settings will pop up. Choose the color by clicking the left image. Then adjust the threshold and smoothing to get the best results. Moving the slider changes the preview frame. This operation preserves the original frames. You can adjust the settings again later, if needed.

Remove color from video. Green screen is very much recommended but this can be used as effect also

Use video mask

If you have video mask available for your video, you can use it to remove background. Click on the “Apply video mask”, then select the mask from the center view and click “Process”. Like in color removal, you can change the preview frame. You also have an option to invert the mask.

Apply a mask to your video. If you have a video with an alpha mask, you can use it to make your video transparent

Object detection and motion tracking

  1. Right click on a video item and select “Generate motion detection data”
  2. There are several options for adjusting object and motion detection
    • Motion threshold: How much the objects change between frames
    • Area size filter: How big or small the detected objects may be. Setting “Min” to 300 means the object must be 300 square pixels in size or larger. If it is smaller, it will be filtered out. “Max” works the same way, but in the opposite direction. Setting either value to 0 means that the filter is not active
    • Same object distance filter: How far the object is allowed to move between frames and still be considered the same object. Unit is pixels.
    • Same object size filter: How much the object is allowed to shrink/grow between frames and still be considered the same object. Unit is square pixels
    • Object frame count filter: In how many frames the object must appear. An object might be detected, but only for a few frames, so those usually should be filtered out
    • Folder to store diagnostics image: Useful when trying to find the best settings for videos. Stores the temporary images to disk so they can be analyzed and tried out with different settings. Each frame produces two images. One image shows the black and white difference between frames. The other image displays detected objects drawn with a green outline. Note that this slows down the process of finding objects
  3. Click Find objects to go through the video
  4. Object data will appear after the process is completed. You can delete the objects, jump to the frame where an object appeared or perform more accurate tracking.
    • Tracking the object is a separate process, done with a different algorithm than “Find objects”. This usually gives better results, but the process is slower. Therefore, it is not done for all objects during “Find objects”.
  5. You can also mark objects manually and then track them. Just click “Mark object manually” and give it a name. Then left click, hold, and drag over the image to select the object. Then click “Track”
  6. After you are satisfied with the objects created and tracked, close the view and click on the item that you wish to follow that object. It has to be within the range of the video item and on a track below it
  7. Expand “Motion syncing” and select the video that you wish to follow. Then select the object. You can adjust the alignment of your item. You can also set the start and end frames, which means your item will remain stationary outside those frames. Clicking the small click icon will set the frame number automatically from the current playhead position
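
The area size and frame count filters from step 2 can be summarised in a few lines of Python. This is only a sketch of the rules as described above (Min/Max in square pixels, 0 meaning the filter is off); the function names and the "area" / "frames" dictionary keys are hypothetical.

```python
def passes_area_filter(area_px2, min_area=0, max_area=0):
    """Apply the Min/Max area rule; a bound of 0 means that filter is inactive."""
    if min_area > 0 and area_px2 < min_area:
        return False  # smaller than "Min": filtered out
    if max_area > 0 and area_px2 > max_area:
        return False  # larger than "Max": filtered out
    return True


def filter_objects(objects, min_area=0, max_area=0, min_frames=0):
    """Keep objects that pass the area filter and appear in enough frames.

    objects: list of dicts with hypothetical keys "area" (square pixels)
    and "frames" (number of frames in which the object was detected).
    """
    return [
        obj for obj in objects
        if passes_area_filter(obj["area"], min_area, max_area)
        and obj["frames"] >= min_frames
    ]


detected = [
    {"name": "speck", "area": 250, "frames": 40},
    {"name": "singer", "area": 300, "frames": 40},
    {"name": "flicker", "area": 900, "frames": 2},
]
print(filter_objects(detected, min_area=300, min_frames=5))
```

With “Min” at 300, an object of exactly 300 square pixels is kept, matching the description in step 2.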
Object detection and tracking is natively supported by Lyric Video Studio
Sync any item to objects in video

Adding timer / countdown to text

You can add a timer in the middle of a text using these two special format expressions:

  1. timer(mm\:ss\:ff)
  2. countdown(mm\:ss\:ff)

These are the supported formats:

d: The number of days in the time interval. This element is omitted if the time interval is less than one day.
hh: The number of hours in the time interval, ranging from 0 to 23.
mm: The number of minutes in the time interval, ranging from 0 to 59.
ss: The number of seconds in the time interval, ranging from 0 to 59.
fffffff: Fractional seconds in the time interval. This element is omitted if the time interval does not include fractional seconds. If present, fractional seconds are always expressed using seven decimal digits.

If you wish to set a start time other than zero, enter the time in the same format that is used in the text. For example, with the mm\:ss\:ff format, the start time would be 01:00:00 to count down from one minute. The time goes in the “Initial time” text field just below the text. With countdown, an initial time is mandatory for it to work properly
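
To show what a mm\:ss\:ff timer or countdown would display at a given moment, here is a small Python sketch. It assumes the specifiers behave like .NET TimeSpan format strings, where ff means hundredths of a second; that reading, the function names and the clamping of the countdown at zero are assumptions for illustration only.

```python
def format_mm_ss_ff(total_ms):
    """Format milliseconds as mm:ss:ff (ff taken as hundredths of a second)."""
    if total_ms < 0:
        raise ValueError("total_ms must not be negative")
    minutes, rem = divmod(total_ms, 60_000)
    seconds, ms = divmod(rem, 1000)
    # mm is the minutes component (0-59), so whole hours are dropped
    return f"{minutes % 60:02d}:{seconds:02d}:{ms // 10:02d}"


def countdown_text(initial_ms, elapsed_ms):
    """Text shown by a countdown that started from initial_ms."""
    return format_mm_ss_ff(max(0, initial_ms - elapsed_ms))


# Initial time 01:00:00 (one minute), 1.5 seconds after the item appeared
print(countdown_text(60_000, 1_500))
```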

Large Language Model

An LLM is supported natively in the app. The use case is converting lyrics into descriptive prompts for plugins to process.

  1. LLM Initial instruction: This works as the “setup” for the language model and guides it
  2. Lyrics tracks: the tracks from which you can import lyrics into #3
  3. Your prompt / imported lyrics. The beginning should state what you wish the LLM to do for you. Default is: ‘Analyze these lyrics and create storyboard base on that. Prompts should be highly detailed and descriptive: ‘ <lyrics here>
  4. Prompt history and adding new ones: Whenever you add a new prompt or duplicate the current one, the previous one is moved to history. You can switch back and forth with the < > buttons. History and prompts are saved on project level, not in application settings
  5. Model tools: You can download curated models here. Mistral and PHI-4 are currently supported. DeepSeek support will come later. After the prompt has finished, you can prepare the output for export. The ‘Preserve lines starting with’ box allows you to customize what is preserved in cleanup. You can also undo the cleanup if needed.
    • Model tools now also include a text box that allows you to sync prompts to the correct places. First, mark the lines that contain the original lyric in the prompt output. If the lyrics are wrapped with ” “, add “*” to the end of the text. When exporting to a track, the app will try to find the correct lyric items for the prompt. (see picture below)
    • Make sure that your lyrics don’t end with “.”, as that will interfere with the detection (will be fixed soon)
  6. Model settings:
    • Model folder: where to look for models. This setting can also be changed in the Settings menu, in which case the default value is taken from there. “Selected model” shows the models in the folder.
    • GPU Layer Count increases the performance of the LLM at the cost of memory consumption. If this value is too high and the GPU runs out of dedicated memory, things will slow down. Keep Task Manager open while trying out values for this.
    • Temperature: This is basically the “artistic freedom” of the LLM. The higher the value, the more it makes things up. Lower values mean it sticks more closely to what it was taught.
    • Lastly, when ready, press generate to load the model and wait for results. It has been noticed that sometimes the first answer is bland or just repeats part of the lyrics. In that case, just press it again. By design, this implementation of the LLM does not preserve the context of the previous ‘chat’ you had with it, so each generation starts off oblivious to what was previously requested. This might change in the future, and if I made a stupid design decision, let me know 🙂
    • You now have the option to use Mistral AI from the cloud. Just select “Mistral API” from the models list and paste your API key from https://console.mistral.ai/api-keys/ At the time of writing (18/02/2025), using their latest model is free, if you don’t mind sharing your prompts with the French 🙂
  7. LLM output: This is where the LLM writes its output. Output varies between models and even between generations. When exporting, each new line becomes one item in the timeline, just like when importing lyrics to a track
  8. Output settings: You can export the output to the selected plugin track. In case the track is capable of doing both images and video, there’s a selection for which one you would like to have.
    • Item length ms: How long each item is. If you are making videos, 5000ms or 9000ms are good starting points.
    • Gap between: How much empty space is left between generated items
    • Pro tip: LLMs tend to output different results each time. Do multiple generations for the same lyric set, always exporting the results to an image track. Then generate the images. While waiting, you can iterate more rounds with the LLM. After you have enough images to fill all the gaps, start fine tuning!
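
The effect of “Item length ms” and “Gap between” on the exported items can be sketched in Python as below: every output line becomes one item, placed one after another with the chosen gap. The function name, the dictionary keys, the start position and the skipping of blank lines are assumptions made for this example.

```python
def layout_items(output_text, item_length_ms, gap_ms, start_ms=0):
    """Place each non-empty output line on a timeline, one after another."""
    items = []
    cursor = start_ms
    for line in output_text.splitlines():
        prompt = line.strip()
        if not prompt:
            continue  # assumption: blank lines do not create items
        items.append({
            "prompt": prompt,
            "start_ms": cursor,
            "end_ms": cursor + item_length_ms,
        })
        cursor += item_length_ms + gap_ms  # leave the gap before the next item
    return items


llm_output = "A foggy harbour at dawn\n\nNeon rain on an empty street\nA lighthouse in a storm"
for item in layout_items(llm_output, item_length_ms=5000, gap_ms=500):
    print(item["start_ms"], item["end_ms"], item["prompt"])
```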
Large Language Model enables you to transform your lyrics quickly to image ideas. It gives you a great starting point for your video

Integrated Stable Diffusion

Stable Diffusion is now integrated (as a plugin) into Lyric Video Studio. The integration includes presets for the supported model types: SD 1.5, SDXL, SD3 medium and Flux. Upscaling and LoRAs are also supported. From Track settings, you must first choose the folder where your models are. The recommended folder structure is to place different types of models in subfolders, with the LoRAs for each type in a subfolder of their own. For example c:\models\sd1.5 and c:\models\sd1.5\loras. In this case, c:\models should be set as the model path in the General Settings of this plugin.

Integrated Stable Diffusion support for you with RTX30xx series GPU
  1. Model selections. Not all types require everything; here’s a list of needed models with links
  2. Select your device. Currently only Nvidia / CUDA and CPU are supported. Using the CPU can be slow! On the first run, the application fetches the required libraries. If you already have CUDA toolkit 12.x installed, this step is skipped.
  3. After launching the application, the first generation loads the model into memory and releases it after a timeout. The timeout can be changed in the plugin settings. If you wish to release the GPU memory right away, there’s a button for that.

Effects

This section goes through some of the more advanced techniques that are possible with regular effects and fragment shaders

Basic effect system

Number transitions are the basis of the effect system and you can do a wide variety of transitions and effects with them. In practice, they allow you to animate all values that are editable. Here’s a basic walk-through of the essential tricks with number transitions

  1. Transition type
    • ‘Current + delta’ means that the ‘Delta change’ value is added to the current value of the item at that time. For example, move the item -25px during the time of the effect
    • ‘Start to end’ allows you to choose the starting and ending value for the transition
  2. Random range makes the delta value vary by the amount you define. The random seed is per item and loop. This adds some variation to item effects, for example when moving the item on screen. Example: a delta of -25 and a random range of 50 make the delta vary between -25 and 25
  3. Start / end time
    • Relative effect start / end positions the effect based on the length of the item. This is useful when you have many items of roughly the same length and want to transition them consistently
    • Absolute means the absolute time the item has been rendered. For example ‘start transition 400ms after the item appeared’. This mode also allows negative values: -400 = ‘start transitioning when there’s 400ms left of the item to be displayed’. This also works for the end value. In the example below, there’s a 10ms effect looped 20 times, starting 400ms before the item ends
    • Static means that the effect is applied to the item immediately and stays on for as long as the item is visible. This should mostly be used for effects like blur or color transition (make some of the texts a different color without assigning a font override to them)
  4. Reverse towards end means that the delta value is transitioned back to the original when nearing the end. The effect start time is used in that case. This is the equivalent of an ‘in/out’ effect and is useful for example in opacity transitions
  5. Loop effect makes the effect loop as many times as the count says. For example, blink the color of the item
  6. Continue after end means that the delta keeps being added to the item for as long as it’s visible. This is used to define, for example, continuous rotation for the item. You could achieve the same by adding a ridiculously high delta value, but it’s easier to define it like ‘rotate 20 degrees in a second and keep going’
  7. Easing curve dictates how the value is transitioned to the desired value, see https://easings.net/ for reference and examples
  8. Reset delta/end random seed (also for start) resets the random seed assigned to the item when it is first animated with a random value. The random value is stored so that randomized animations are consistent between runs and renders. You can get a new random value with the button if the effect is not “strong” enough.
  9. Override parent effect means that if the same effect is applied on track or project level, this effect instance is used instead, if checked. Project and track level effects allow a more consistent way of transitioning items. But sometimes you might want one specific item to get a slightly different transition to mix things up.
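
To make the ‘Current + delta’ transition with absolute times, random range and an easing curve more concrete, here is a minimal Python sketch. It is an interpretation of the description above, not the app's implementation: the function names are hypothetical, the random offset is assumed to lie between 0 and the random range (inferred from the -25 / 50 example), and easeInOutQuad is just one of the curves shown on easings.net.

```python
import random


def ease_in_out_quad(p):
    """easeInOutQuad as shown on easings.net; p and the result are in 0..1."""
    return 2 * p * p if p < 0.5 else 1 - ((-2 * p + 2) ** 2) / 2


def resolve_time(t_ms, item_length_ms):
    """Absolute mode: negative values count back from the end of the item."""
    return item_length_ms + t_ms if t_ms < 0 else t_ms


def current_plus_delta(base, delta, t_ms, start_ms, end_ms, item_length_ms,
                       random_range=0.0, seed=0, easing=ease_in_out_quad):
    """Value of a 'Current + delta' transition t_ms after the item appeared."""
    start = resolve_time(start_ms, item_length_ms)
    end = resolve_time(end_ms, item_length_ms)
    # Seeded so the randomized delta stays the same between runs and renders
    effective_delta = delta + random.Random(seed).uniform(0.0, random_range)
    if t_ms <= start:
        progress = 0.0
    elif t_ms >= end:
        progress = 1.0
    else:
        progress = (t_ms - start) / (end - start)
    return base + effective_delta * easing(progress)


# Move an item from x = 100 by -25 px during the first second of a 2 s item
print(current_plus_delta(100, -25, 500, 0, 1000, 2000))
```

Looping, ‘Reverse towards end’ and ‘Continue after end’ are left out to keep the example short; they would change how progress is computed, not the basic idea.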
Lyric Video Studio allows flexible and extensive value transitions with easing curves. Create all types of transitions, store effect presets

Partial blur

This can be achieved with a combination of two tracks, a blur effect and a mask.

  1. Start by creating an image track and add an image of your choosing to that track
  2. Duplicate the track below the first track
  3. Click on the image item on the timeline to open settings and select “Effects” -> “Add new effect” -> “All” -> “Blur”
  4. Set the start & end types of both sigmas to static. Adjust “Delta change” to make the image blurrier.
  5. Next, on the same track (the duplicated, last one), click “Edit masks” -> “Add new mask” and define the mask properties.
  6. When closing the mask view, you can see that the blur now affects only the masked area
  7. Note that you can animate mask properties to make the mask move, if needed
Blur effect to make whole image or part of the image blurred

Radial blur

Radial blur can be achieved with fragment shaders and can be applied to text as well. Click a video or image item to show its effects

  1. Expand “Shader” in image settings
  2. Click “Select shader” -> “Effects” -> “Radial blur”
  3. Use “Blur strength”, “Blur center % X” and “Blur center % Y” to adjust the effect power and center point. These values can be animated also.
Radial blur with fragment shaders. Adjust shader values dynamically

Color transfer

You can match colors between images and videos to make separate takes look more consistent.

  1. Start by adding a Color correction plugin track (image or video); we’ll use video in this example
  2. Import videos from track
  3. Select the reference video from the dropdown and select the reference frame. This frame will be used for all videos in this track
  4. Optionally click “Preview” to convert the selected frame (left-most slider)
  5. Click generate (or generate all) to do the final color conversion
Color transfer between videos supported. Make your shots look more consistent

Project collaboration / sharing / plugin content delivery

If you have opted in to either DropBox or Google Drive (Settings -> Plugins -> Plugin Content Delivery), you can share your project via the cloud. First, start by choosing your service:

Opt in to DropBox or Google Drive for easier material sharing with 3rd party services as well as simple project sharing

Clicking ‘Refresh Credentials’ will choose the service and open up your browser for login. The application receives a temporary access token to your cloud service. The service is used for two purposes:
1. Uploading images / videos for the purpose of image-to-video or video manipulation. Many video service providers require this for creating videos from images.
2. Storing the archived project for easier sharing. Open your project settings (P or from the View menu), then click “Archive and upload”. After the project archive is created and uploaded, the public link will be displayed in the text box

Archiving and sharing is easy with Lyric Video Studio

You can copy the link by clicking the box and then send it to your collaborators or band mates, for example. The link is pasted into a popup that is opened from this menu

Your collaborators, band mates or customers can open the archived project, even in trial mode
Paste the shared url here to download the archive

Note that the trial version of the app can also open the archive. None of your access tokens (of plugins) are shared with the project. Your prompts, however, will be shared, along with everything else you see on the timeline. Also note that if you opt in to DropBox, the link is valid for only four (4) hours