New audio AI generates any sound effects in addition to music


[10:26 Thu, 2 February 2023]

How quickly AI development is progressing can be seen, among other places, in "text-to-music", i.e. AIs that generate music from a text description. Google had only just presented MusicLM (we reported) when, a few days later, a research team from the University of Surrey and Imperial College followed with AudioLDM. The project is very promising, especially for filmmakers, because it synthesises not only pieces of music including instruments from a text prompt, but also noises (SFX, i.e. sound effects). AudioLDM can also produce entire soundscapes on request, which is ideal for background sound in films.

The AudioLDM team also wants to make the programme and its model available online as open source. It could then not only be used freely on one's own computer, but also be improved by others and integrated into other programmes. For example, it could serve as a plug-in in video editing programmes such as Adobe Premiere or Blackmagic's DaVinci Resolve to generate sound backdrops. Another argument for using AudioLDM at home is that it is said to be very efficient, i.e. it requires relatively little computing power, and that training, for example on your own sound samples, can be done on a single GPU (such as an NVIDIA RTX 3090).


AudioLDM also offers practical functions already familiar from image AIs: inpainting (part of an audio recording is replaced via text prompt by another sound that matches the rest), style transfer (a melody is played by a different instrument) and super resolution (the resolution, and thus the audio quality, of a music or speech recording with a low sampling rate is increased via upsampling).

Here is an example of style transfer: trumpet to children's singing

In addition to describing the sounds to be generated, the prompt can include other parameters that affect the result, such as the type of acoustic environment (reverberation), the material of the objects making the sound, and the temporal order of events.

The sound of a steam engine:

Cutting meat on a wooden table:

For more complex soundscapes, the researchers enlist the help of the text AI ChatGPT. Given the prompt "Describe the sound of the universe", for example, it responds with a detailed description ("Radio emissions from stars, planets, galaxies and other celestial bodies, high fidelity, as well as the sounds of solar winds and cosmic rays"), which can then be used as a prompt for AudioLDM and generates the following output:

Model of AudioLDM

The source code was originally supposed to be published together with the research paper on Monday. However, the team is still hesitant to put the model (i.e. the result of the training process) online because of the recently announced copyright infringement lawsuits against several image AIs, since the well-known BBC SFX library was used for training. Although this library may be used freely for non-commercial purposes, it is unclear whether this also covers the training of AIs, as the legal situation has not yet been settled. Once this has been clarified, the code is to be published together with the model.

Examples of music generation:

More Audio AI Projects

The following timeline shows just how rapidly development in the field of audio AIs is progressing.

Audio AI Timeline

Within a few days, several text-to-audio AIs of very different quality have appeared, such as Noise-to-Music and Moûsai: Text-to-Audio with Long-Context Latent Diffusion. The Chinese project Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models seems particularly worth mentioning to us, because it enables not only audio-to-audio but also image-to-audio and video-to-audio, i.e. sound is generated to match a video clip:

More information at audioldm.github.io

Last update: 20 March 2023, 20:00. slashCAM is a project by channelunit GmbH.