[Technology and products] Music AI technology and digital singer Vivizen from Metabuild


 

AI is currently shifting from discriminative models, which distinguish data by finding and learning its key patterns, toward Generative AI, which captures and learns many types of patterns in order to create new data.

 

Generative AI is a technology that learns the characteristics of the content in its training data through unsupervised learning and generates new data with features similar to the original. It is drawing particular attention because it can produce digital creative works in fields such as literature, art, and music.

 

In its "Top Strategic Technology Trends for 2022," Gartner introduced 12 strategic technology trends for growth, digitalization, and efficiency. Among them, Generative AI was selected as a major technology that will expand digitalization over the next three to five years. Gartner predicts that the share of digital data generated by Generative AI will grow from less than 1% at present to more than 10% by 2025.

 

In line with this AI paradigm shift and strategic technology trend, Metabuild made significant progress in the field of Generative AI in 2021. Metabuild recently developed AI multi-tonal vocal technology that can be used in music fields such as K-Pop, which is popular worldwide.

 

The technology, called "MAI VOCAL," is a Singing Voice Synthesis (SVS) system consisting of two models, an Acoustic Model and a Vocoder Model, that imitates the tones of various singers and generates natural, high-quality singing voices.

 

 

The Acoustic Model takes lyrics, notes, and note durations as input and generates a Mel-Spectrogram, which represents the magnitude of the frequency components. It was developed based on the FastSpeech AI model, which is robust against repeated and skipped words and can control voice speed and prosody.
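As a rough illustration of how note durations relate to Mel-Spectrogram frames, the sketch below expands each note into one entry per frame, in the spirit of the length regulator used by FastSpeech-style models. It is not Metabuild's implementation: the `Note` class, the function names, and the sample rate and hop length values are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Note:
    lyric: str       # syllable sung on this note
    pitch: int       # MIDI note number
    duration: float  # length in seconds


def frames_for(duration, sample_rate=22050, hop_length=256):
    """Number of mel frames covering `duration` seconds (at least one).

    sample_rate and hop_length are assumed values, not taken from MAI VOCAL.
    """
    return max(1, round(duration * sample_rate / hop_length))


def expand_to_frames(notes, sample_rate=22050, hop_length=256):
    """Repeat each note's (lyric, pitch) pair once per mel frame."""
    frames = []
    for note in notes:
        count = frames_for(note.duration, sample_rate, hop_length)
        frames.extend([(note.lyric, note.pitch)] * count)
    return frames


notes = [Note("la", 60, 0.5), Note("la", 62, 0.25)]
frames = expand_to_frames(notes)
```

Changing the durations before expansion is what allows a model of this kind to speed up or slow down the synthesized voice.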

 

The Vocoder Model takes the Mel-Spectrogram output by the Acoustic Model as input and generates the waveform of the singing voice. It was developed based on the HiFi-GAN AI model, which offers faster synthesis and training than other approaches and good synthetic speech quality.
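The two-stage data flow described above can be sketched as follows. Both functions are placeholders that only reproduce the shapes of the data passed between the stages; they contain no trained model, and the names, the number of mel bands, and the hop length are assumptions for illustration only.

```python
N_MELS = 80        # assumed number of mel bands per frame
HOP_LENGTH = 256   # assumed number of audio samples per mel frame


def acoustic_model(n_frames):
    """Placeholder acoustic model: returns an all-zero mel-spectrogram,
    one N_MELS-long vector per frame."""
    return [[0.0] * N_MELS for _ in range(n_frames)]


def vocoder(mel):
    """Placeholder vocoder: returns silence with the length a vocoder
    would produce, HOP_LENGTH samples for every mel frame."""
    return [0.0] * (len(mel) * HOP_LENGTH)


def synthesize(n_frames):
    """Run the two stages in order: score -> mel-spectrogram -> waveform."""
    mel = acoustic_model(n_frames)
    return vocoder(mel)


waveform = synthesize(10)
```

Keeping the two stages separate means either model can be retrained or replaced as long as the Mel-Spectrogram format between them stays the same.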

 

To acquire and learn the singers' voice data required by MAI VOCAL, Metabuild acquired the voices of 92 singers, grouped by voice characteristics such as age, gender, tone, and genre, through the Artificial Intelligence Learning Data Construction Project of Korea's National Information Society Agency (NIA). From the acquired voices, 4,000 songs were collected and processed, with lyrics and MIDI information labeled as notes corresponding to the start and duration of the singer's pronunciation, to construct vocal data for AI training. Based on this vocal training data, Metabuild developed an AI multi-tonal vocal system that can synthesize 100 vocal voices.
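A note-level label of the kind described above might look like the following sketch. The field names and the overlap check are hypothetical and do not reflect the actual dataset schema; the pitch-to-frequency conversion is the standard equal-temperament mapping with A4 (MIDI note 69) at 440 Hz.

```python
from dataclasses import dataclass


@dataclass
class NoteLabel:
    lyric: str        # syllable pronounced on this note
    start: float      # onset of the pronunciation, in seconds
    duration: float   # length of the pronunciation, in seconds
    midi_pitch: int   # MIDI note number (69 = A4)

    @property
    def end(self):
        return self.start + self.duration

    @property
    def frequency_hz(self):
        # Equal temperament: each semitone multiplies frequency by 2**(1/12).
        return 440.0 * 2 ** ((self.midi_pitch - 69) / 12)


def overlapping_pairs(labels):
    """Return pairs of time-adjacent labels that overlap.

    A single vocal line is assumed to be monophonic, so a clean label
    file should produce an empty list.
    """
    ordered = sorted(labels, key=lambda label: label.start)
    return [(a, b) for a, b in zip(ordered, ordered[1:])
            if a.end > b.start + 1e-9]


labels = [NoteLabel("la", 0.0, 0.5, 69), NoteLabel("la", 0.5, 0.25, 81)]
```

A check like `overlapping_pairs` is one simple way to catch labeling mistakes before the data is used for training.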

 

The AI multi-tonal vocal system developed by Metabuild can perform a range of vocal styles, such as K-Pop dance, ballads, and children's songs, in various male and female tones ranging from teens to 50s.

 

Metabuild's MAI VOCAL system continues to evolve, learning songs and tones from various genres 24 hours a day on the company's music platform cloud.

 

In addition, Metabuild and Chilloen created the virtual AI singer "Vivizen," built around the clear tone of a woman in her early 20s, one of the singing tones that the MAI VOCAL system can synthesize. Vivizen was produced through a sophisticated 3D modeling process based on collecting and analyzing various 2D images of women in their early 20s. From the initial planning stage, it was designed with the characteristics of a virtual digital singer in mind, so that mouth shapes, emotional expressions, and body movements appear natural when singing. The whole body was also produced as a 3D model with rigging for joint movement, so that choreographed dance can be expressed through motion capture technology.

 

 

Through Chilloen, Metabuild plans to develop the AI singer Vivizen, which uses its AI vocal technology, into an AI digital human active in the metaverse, with activities including social media, virtual singing, and advertising. In addition, while continuing to develop its AI multi-tonal vocal technology, it plans to develop further AI models, such as AI composition and arrangement, to lead Generative AI technology that can be offered as a service in the music field.

 

Published: 31 Dec, 2021
