K-Pop Daisuki - Predicting Future success with music metrics?


After researching YouTube video behavior for over 5 years and utilizing statistical analysis to understand how videos perform in terms of views, likes, and most importantly, views per like on a daily basis, I have gained significant insights into the dynamics of music videos. It's worth noting that the majority of my research focused on Korean videos as my primary data source. However, this concentration has yielded valuable insights into the viewing habits of the Korean public and potentially the global fanbase of Korean music.

Alongside my research into YouTube music videos, I also delved into other metrics such as streaming (primarily focusing on Korean data), sales, and music show awards. Each metric possesses unique nuances, some of which are readily apparent, while others require careful mapping and consideration. In this article, I intend to share my findings and, most importantly, address a question that I have encountered numerous times from individuals or companies who have reached out to me expressing interest in some form of predictive system. It's important to note that this article is intended to be personal in nature, and I do not intend to present a thesis. Furthermore, I will aim to provide concise summaries wherever possible.

The source data.

For YouTube videos, there are two critical data points to consider. Firstly, there's the obvious view counter publicly provided by YouTube, which allows you to monitor daily views and plot a daily view count for each video. This process is relatively straightforward, and numerous platforms utilize the YouTube API for this purpose. However, it's important to acknowledge that a considerable portion (though not all) of high-profile music videos have inflated view counts due to ad-promoted views. These views are generated each time the video, as an advertisement published on YouTube's TrueView system, is viewed. It's essential to recognize that these views are not entirely organic; they may represent unwilling or accidental views. These so-called non-organic views, along with other non-organic views such as bot repeats, contribute to views with low or no validity. Despite this, they are included in the total view count.

Calculating non-organic views is a complex endeavor, and this article does not aim to delve into the statistical models that facilitate accurate estimation, such as View Fallout Behavior (the natural decline in viewership over time) or validation through the View per Like ratio. All videos are subject to a certain View per Like ratio, which is not constant but exhibits predictable changes as a video ages. Utilizing this ratio can be a valuable tool in identifying non-organic views, which typically do not translate into likes, as opposed to organic views, which have an expected conversion rate into likes. Fortunately, in October 2019, YouTube made the decision to exclude ad-promoted views from their global charts (a decision that was arguably overdue). This now provides a moderately effective means to identify ad-views. However, it's worth noting that other non-organic views remain undetected. By utilizing the weekly charts, we can subtract the views displayed on the chart (representing organic views) from the total public views of that week to calculate the non-organic views.
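The chart subtraction and the View per Like check described above can be sketched in a few lines of Python. This is only a minimal illustration, not the statistical model used at Daisuki: the function names and the weekly figures are hypothetical, and the sketch assumes the public counter and the weekly chart cover the same seven-day window.

```python
def non_organic_views(total_public_views: int, chart_views: int) -> int:
    """Estimate a week's ad-promoted views as public views minus chart views.

    Clamped at zero because the two sources may not cover exactly
    the same window (assumption of this sketch).
    """
    return max(total_public_views - chart_views, 0)


def views_per_like(views: int, likes: int) -> float:
    """Views per like for a period; infinite when no likes were recorded."""
    return views / likes if likes else float("inf")


# Hypothetical weekly figures for one video (not real data).
week_total = 12_000_000  # views gained on the public counter
week_chart = 7_500_000   # views credited on the YouTube weekly chart
week_likes = 150_000     # likes gained during the same week

print(non_organic_views(week_total, week_chart))  # 4500000
print(views_per_like(week_total, week_likes))     # 80.0 (ad views included)
print(views_per_like(week_chart, week_likes))     # 50.0 (chart views only)
```

In this made-up case the ratio computed on all views is noticeably higher than the ratio computed on chart views, which is the kind of gap that ad-promoted views tend to produce, since they rarely convert into likes.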

When it comes to streaming, Korea boasts a plethora of platforms, including Melon, Genie, FLO, Bugs, and several lesser-known ones. Some systems even integrate global platforms like Apple Music, Spotify, and YouTube. The most respected tracker in Korea, Instiz iChart, continually updates its index to include or exclude various streaming services. As of the time of writing this article, Instiz iChart tracks Melon, Genie, FLO, Vibe, Bugs, and YouTube Music.

One challenging aspect of streaming measurement is that each platform has its own user base, meaning that not all charts should be considered with equal weight. However, Instiz iChart prioritizes chart positions rather than the sheer number of streams or user base. For a more comprehensive analysis of streaming data, Circle Chart, operated by the Korea Music Content Association, serves as the best source. Circle Chart offers three distinct streaming charts:

  1. A raw streaming chart, which counts only streams from Korean companies
  2. A Digital chart, which balances streams with digital downloads, as well as BGM (background music) downloads for karaoke and ringtones (a popular trend among Koreans who enjoy customized ringtones of their favorite songs).
  3. The most recent addition is the Global Streaming chart, which, in addition to Korean companies, includes some international platforms such as Apple Music and Spotify.

The streaming chart is particularly useful for tracking raw streaming within Korea, which is of primary interest to Korean companies seeking the most listened-to Korean artists. The Digital chart, on the other hand, is sometimes preferred for a more comprehensive view since it incorporates digital downloads and other forms of digital media, providing a better understanding of an artist's reach. For example, a digital download is often equated to 500 streams. This is the chart utilized at Daisuki. The Global Streaming chart, while omitting digital downloads, includes international platforms with the aim of measuring the international reach of Korean artists. However, this approach comes with a caveat: the data for Korean streaming is meticulously defined and tracked, so adding just a couple of global services may be perceived as incomplete. The chart is most useful for measuring international reach in countries where the included international services (such as Apple Music and Spotify) dominate the streaming market. In countries where other services are prevalent, the data may be undercounted. Nevertheless, this approach represents the best available option without introducing excessive complexity.
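To make the "one download is often equated to 500 streams" idea concrete, here is a small Python sketch. It is not Circle Chart's actual formula, which also accounts for BGM downloads: the function name, the song names, and the figures are hypothetical, and only the 500:1 ratio comes from the text above.

```python
STREAMS_PER_DOWNLOAD = 500  # ratio quoted in the text; real charts may weight differently


def stream_equivalents(streams: int, downloads: int,
                       streams_per_download: int = STREAMS_PER_DOWNLOAD) -> int:
    """Fold downloads into a single stream-equivalent total."""
    return streams + downloads * streams_per_download


# Hypothetical songs: (streams, downloads). Not real data.
songs = {
    "Song A": (9_000_000, 2_000),
    "Song B": (8_500_000, 4_000),
}

ranked = sorted(songs, key=lambda name: stream_equivalents(*songs[name]), reverse=True)
for name in ranked:
    print(name, stream_equivalents(*songs[name]))
# Song B 10500000
# Song A 10000000
```

Song A leads on raw streams, but Song B comes out ahead once downloads are weighted in, which is why a combined chart can rank songs differently from the raw streaming chart.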

Sales present a unique challenge, as illustrated by the two major companies that track sales: Circle and Hanteo. Circle counts any media that leaves the distribution center for stores or direct-to-customer deliveries as a sale. On the other hand, Hanteo only includes direct sales from outlets. Each approach has its own benefits and caveats.

Circle's method of counting sales results in the inclusion of unsold inventory that remains on the shelves of stores. This is because, for their purposes, once the items leave the distribution centers, they are considered sold. Consequently, if this stock is only sold weeks or months later, the sale will still be counted based on when it initially left the distribution center. Moreover, if stores provide free samples, have damaged products that cannot be returned, or retain stock for extended periods, these factors introduce inaccuracies into the count. Returns are also subject to delay since any returned items will first go back to the retailer and only be counted once they are returned to the distribution center. While Circle is known for covering 100% of physical media, it tends to overestimate final sales due to its inclusion of unsold stock.

On the contrary, Hanteo operates by requiring direct reports from retailers regarding sales. Companies must be affiliated with their tracking services and are obligated to promptly report their sales. Each reported sale is considered a sale to an end-user, and retailers are prohibited from including free samples or damaged material as sales, addressing the noise that Circle encounters. However, it's worth noting that broken material is likely returned to distributors for potential recycling of cases or other materials. Returns are also processed as quickly as possible. While this method may seem ideal, it does have significant deficiencies. 

Firstly, not all retailers are part of the Hanteo network, meaning that their sales go entirely uncounted. Hanteo acknowledges that between 10% and 20% of total sales are not accounted for. Another significant issue contributing to severe undercounting in Hanteo is the presence of online outlets. Many online stores ship directly from distribution centers to customers and may not even be aware of Hanteo's existence, particularly foreign shops that don't report data for Korea. Additionally, sales and return reports are rarely submitted on a daily basis; some companies report weekly, others monthly, and some lack a definitive schedule. Consequently, similar to Circle and its delay in clearing stock, Hanteo experiences a delay depending on how frequently retailers submit their reports.

In summary, Circle typically overestimates sales in the short term but tends to correct these inaccuracies over time. On the other hand, Hanteo underestimates sales in the short term, and even over time, its data still falls short of the total value. As a result, some tracking services that prioritize up-to-date data prefer using Hanteo, while others willing to wait for more precise values opt for Circle reports. This is also why Circle Charts are released with a delay, typically 4 days for weekly charts and approximately 10 days for monthly charts.

Music awards shows each have their own distinct methods for rating and awarding artists, a topic that could warrant an entire article in itself. However, to provide a brief overview: 

  • All awards shows typically utilize a method for streaming data, with some using individual charts and others relying on data from Circle Chart.
  • Physical sales are also taken into account, with some shows using data from Circle and others from Hanteo.
  • YouTube views, referred to as SNS, are a standard metric, and as of 2024, engagement on YouTube Shorts has been incorporated into some awards shows. However, the methodology for counting engagement on YouTube Shorts remains unclear due to the large number of shorts per song.
  • Additionally, awards shows often consider a broadcast score, which is based on the frequency with which an artist is featured on broadcaster programs, including the awards show itself.
  • Finally, there are typically a series of votes, including a pre-vote and a live-vote. 

The percentage weight assigned to each of these metrics varies widely among different awards shows, but streaming data usually holds the highest value.
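As a rough illustration of how such a weighted score behaves, consider the Python sketch below. The weights, the 0 to 100 scale for each metric, and the two artists are entirely hypothetical; no specific show's formula is being reproduced here.

```python
# Hypothetical weights in percent; every show publishes its own split.
WEIGHTS = {"streaming": 50, "sales": 15, "sns": 10, "broadcast": 15, "votes": 10}


def weighted_score(scores: dict, weights: dict = WEIGHTS) -> float:
    """Combine per-metric scores (each on a 0-100 scale) into one total."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must add up to 100 percent")
    # A metric missing from `scores` counts as zero.
    return sum(weights[m] * scores.get(m, 0) for m in weights) / 100


# Made-up artists: A dominates streaming but has no broadcast points.
artist_a = {"streaming": 90, "sales": 40, "sns": 60, "broadcast": 0, "votes": 50}
artist_b = {"streaming": 70, "sales": 80, "sns": 50, "broadcast": 100, "votes": 60}

print(weighted_score(artist_a))  # 62.0
print(weighted_score(artist_b))  # 73.0
```

Even with streaming carrying half of the total, the artist with the stronger streaming score loses in this example because the smaller metrics add up in the other direction.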

The issue with Music Show Awards lies in their failure to fully grasp the nuances of all the metrics involved, despite acknowledging that streaming is the most crucial metric. For instance, YouTube views are often counted as totals, giving an advantage to videos inflated by ad promotion. Sales are typically tallied weekly, meaning that only the first week after release tends to receive full credit, and the scoring ignores the fact that sales are directly correlated with fandom size and even gender demographics (with females generally purchasing more merchandise than males).

The broadcasting value, while intended to differentiate awards shows, has faced significant criticism and has been embroiled in scandals in the past. Since the broadcaster controls when an artist appears on their programs, there is potential for manipulation of this value. Some artists who have had disagreements with the broadcast director of a particular station have reportedly been excluded from shows, resulting in a decrease in their broadcast score. In some cases, awards shows may even deem artists with zero broadcast points ineligible for consideration. 

Artists like Taeyeon and Lim Young Woong have been victims of this type of "bullying," although once the issue has been brought to light and the broadcaster punished, Music Show Awards understandably do not revoke awards already given. Such actions would be deeply unfair to the artists who already won the award.

In fact, the chaotic nature of Music Awards Show predictions and winners is a very good illustration of why metrics alone are a poor system for evaluating songs. In the end, no matter how a broadcaster presents it or formulates its scoring system, all Music Awards are just a variation of a popularity award.

Predictor Potential of metrics

So, for instance, how can we tell if an artist will succeed just by looking at their numbers? It's like that saying about the stock market: "Past success doesn't guarantee future gains." But let's dig into all the data regardless.

YouTube

The initial point to clarify is that the total number of views cannot be relied upon. Non-organic views must be disregarded initially, as both studios and individuals can initiate campaigns and allocate funds for ad-promotion of any available video. Once we've excluded these views from consideration, we can delve into the discussion on views.

YouTube serves as a problematic metric for gauging success, primarily because it showcases music videos rather than the raw song itself. Music videos have historically been crafted to provide a visual element aimed at captivating television (and now internet) audiences, rendering them uniquely engaging. Take, for example, "Gangnam Style," the record-breaking Korean music video. Despite its immense success, this didn't necessarily translate into equivalent profits or attention for PSY. The album's sales remained average, and while the song was undeniably catchy, it only briefly topped streaming metrics without leaving a lasting impact. The video's clever direction and plot attracted millions of views and replays, but it didn't necessarily convert viewers into new fans. Many people still perceive PSY as a one-hit wonder, despite his other well-known songs in Korea. Furthermore, YouTube's international reach means that a high view count doesn't automatically equate to global recognition. Views may be concentrated in specific countries rather than being reflective of worldwide popularity. Notably, a significant portion of Korean music video views comes from Southeast Asia. Given Korea's relatively small population, its YouTube videos are susceptible to being heavily influenced by international viewership. Japan, for instance, has more than twice Korea's population and is known for its enthusiasm for K-Pop.

Streaming

Streaming serves as a highly effective metric for gauging interest in the music itself. People will only listen to what they enjoy, regardless of visuals or other factors like idol status or the quality of the music video. Therefore, in any serious analysis, streaming data (along with other digital media metrics) is considered the most meaningful. However, streaming data is still subject to various external influences. For instance, a high-quality song may initially go unnoticed, only to be "discovered" and gain popularity later on. Examples of this phenomenon include Younha's "Event Horizon," which took seven months after its release to become a major hit, and BIBI's "Bam Yang Gang," which took almost a month. Additionally, there are songs that eventually become famous despite having consistent streaming metrics without a significant peak. One notable example is Brave Girls' "Rollin'," which resurfaced due to a viral video of the group's performances for the military but had already gained widespread recognition and popularity by that point. Numerous other songs achieve similar levels of fame without necessarily having standout streaming metrics.

Sales

Sales in the 21st century present a complex landscape. After a significant decline in sales during the late 2000s, Korean sales rebounded rapidly, experiencing nearly exponential growth year by year as of 2024. Even internationally, physical sales have been making a comeback, although at a slower pace compared to Korean sales. However, the strength of physical sales in Korea is uniquely attributed to its Idol culture.

The fervent fans of an idol group or artist are typically eager to acquire merchandise, and owning a collection of physical albums is one of the most tangible displays of support and memorabilia. Many fans refrain from even opening the packaging, preferring to preserve it as a mint-condition collectible for display, especially since they can access the music, often in higher quality, through streaming services. Consequently, the size of a fanbase, as well as the quality of the physical packaging and additional extras, significantly drive up sales. It's noteworthy that record sales are consistently linked to idol groups, whereas "regular" singers, even those with an idol background, tend to have weaker physical sales. The top singers in Korea often lag significantly behind the top idol groups in terms of sales numbers.

Metric Relations

Another crucial factor to consider across all metrics, with the possible exception of streaming, is the phenomenon of delayed hits. A major, widely recognized hit often leads to significant metrics for subsequent releases rather than for the hit itself. Take, for example, "Gangnam Style": PSY's sixth album, which included the hit song, experienced only a modest increase in sales compared to his previous albums. However, the singles released after "Gangnam Style" became massive hits. By the time PSY released his seventh album three years later, it garnered one of his lowest sales figures. The main benefit of "Gangnam Style" was the heightened anticipation for his subsequent release, the single "Gentleman," which predictably generated substantial sales and music video streams, despite its streaming metrics being somewhat lackluster.

With idol groups, this phenomenon is even more pronounced. A significant hit on one release consistently leads to increased sales on the next release. However, the pressure remains for the artist to maintain a string of hits to sustain high sales figures. When an inevitably sub-optimal release occurs, the subsequent release typically experiences lower adoption rates. As a result, the first inferior release after successive big hits, despite its inferior quality, may still have strong sales because fans pre-ordered it with expectations of another hit. However, subsequent releases, even if of higher quality, often see reduced adoption rates.

Each release represents its own narrative, and sales typically yield a deferred benefit for about a year or so. However, no single song or even period can guarantee a huge success. There are numerous artists who experienced two or even three major hits, only to be followed by multiple flops that ultimately ended their careers. If there's one factor that can aid in predicting success, it's the stability and consistency of releases. Maintaining consistency allows fans to anticipate that the next release will be "good," but not necessarily "great" or "bad," which tends to yield more reliable results than taking a gamble with each release.

The unfortunate

No article about Korean popularity metrics, and maybe popularity metrics at large, should come without the unfortunate warning: they can all be manipulated one way or another. The bigger the fanclub (past success), the higher the degree of influence it has in boosting all metrics and, therefore, in keeping an artist afloat. In Korea, for instance, small groups of fans used to coordinate in the small hours (midnight to 6 a.m.) so that their favourite artist could reach #1 on realtime charts, which eventually led most realtime charts to stop counting any milestone achieved during those hours. The same can be said about almost any metric: if enough fans (and sometimes money or influence) exist, there is always a way to boost the charts. For that reason, raw data such as realtime streaming, video views, or even sales is often not a great metric at all.
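The small-hours rule that realtime charts adopted can be pictured with a short Python sketch. The data layout, the function name, and the reading of the window as 00:00 to 05:59 are assumptions made for illustration; each chart applies its own exact rules.

```python
from datetime import datetime

QUIET_HOURS = range(0, 6)  # 00:00 to 05:59, an assumed reading of the window


def countable_number_ones(entries):
    """Keep only #1 placements reached outside the quiet hours.

    `entries` is a list of (timestamp, rank, song) tuples, a layout
    invented for this sketch.
    """
    return [
        (ts, song)
        for ts, rank, song in entries
        if rank == 1 and ts.hour not in QUIET_HOURS
    ]


# Made-up hourly snapshots of a realtime chart.
entries = [
    (datetime(2024, 3, 1, 2, 0), 1, "Song A"),   # #1 at 2 a.m.: ignored
    (datetime(2024, 3, 1, 9, 0), 1, "Song B"),   # #1 at 9 a.m.: counted
    (datetime(2024, 3, 1, 14, 0), 2, "Song A"),  # not a #1 at all
]

print([song for _, song in countable_number_ones(entries)])  # ['Song B']
```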

In conclusion

  • YouTube viewership serves as a poor predictor of success and typically only contributes to deferred attention for the next release. The fact that well over 50% of current artists use ad-promoted views to promote their videos (and gain views) makes this metric completely useless.
  • Streaming rates are largely compartmentalized, with one release having little influence on the next.
  • Sales metrics are a consequence of success rather than a predictor, and they typically follow once an artist has amassed a large fanbase and achieved success. While not a great predictor, sales can be used to gauge whether an artist is maintaining their fanbase.
  • Music show awards are unreliable predictors due to their use of a variety of metrics, each with its own type of influence. Additionally, they are often heavily influenced by an artist's established fanbase, who may bulk-buy albums and vote in large numbers.

There is indeed no metric that can reliably predict whether an artist will release something successful in their next release, let alone in their future career. However, here are some ways to interpret and potentially benefit from these metrics:

  • YouTube: High organic viewership, if accompanied by other metrics, may indicate genuine success. However, if it's not followed by increased streaming, it suggests that the video, rather than the song itself, is attracting views. If both metrics are up, it might indicate true success, but ultimately, YouTube viewership mainly contributes to deferred attention for the next release.
  • Streaming: High streaming numbers, without accompanying metrics, indicate present success but may only have a minor deferred effect on fanbase and interest in the next release.
  • Sales: Sales metrics are not predictive of present or future performance; they are a consequence of previous success. Strong sales numbers in the current release simply confirm stability and consistency.
  • Music show awards: These awards primarily indicate fanbase size and temporary success, particularly if accompanied by high streaming numbers. They hold higher value for idol groups, as they can lead to increased business opportunities and sponsorships/partnerships. However, they do not correlate with future quality or performance and may only contribute to deferred attention until the next release.

Even predicting the behavior of a single music video requires advanced statistical analysis and remains inherently flawed as a global system, often necessitating artist-tailored multipliers for increased precision. Regarding other metrics, only the sales of idol groups can be somewhat predicted for the next release based on fanbase and current success. However, such predictions are typically short-lived and may not hold true for future releases.

In essence, the best we can achieve, with considerable effort, is to assess the current health of an artist's career and provide an educated guess for their immediate next release. However, accurately predicting long-term success is elusive.

