Soridata - Data handling Transparency


Soridata is an independent personal data aggregation project that compiles information from various public and official sources within the Korean music industry, including but not limited to CircleChart, Melon, music show broadcasts, Namu.wiki, Wikipedia, official social media accounts and official artist profiles. This site is designed to organize verified, public metrics into an accessible format for a global audience.

Affiliation and Accuracy - Soridata is not affiliated with, endorsed by, or partnered with any entertainment agency, music streaming service, television network, or any other company. While we strive to maintain the highest level of accuracy and provide a mechanism for community-requested updates, the data is provided "as is" and may contain mistakes.

Source Material and Disputes - All metrics and artist information are derived from data made publicly available by the original official sources. Consequently, any discrepancies, errors, or complaints regarding the data itself should be directed to the respective original source (e.g., the specific streaming platform or chart provider). Soridata acts solely as a secondary information provider and does not possess the authority to alter official industry records or certifications. Most data is automatically retrieved from the official sources and kept up to date, and is therefore subject to the original sources' issues and mistakes.

Specifics - Below is a list of brief explanations of how some of the information and aggregations are calculated.




Youtube Music Videos

TL;DR: We update views and likes every day for videos released in the last 3 years, and every other day for older videos. Videos that are offline are marked as dead and no longer updated, but once a month we automatically check whether they have come back online.

Every 10 minutes, our server automatically checks a batch of videos using the Youtube API, directly from the source.

The data retrieved is global views/likes, and not just Korean views.
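
The update cadence described above can be sketched as a small scheduling check. This is only an illustration: the function names, the use of calendar dates, and the approximation of "3 years" as 3 × 365 days are assumptions, not the site's actual code.

```python
from datetime import date, timedelta

# "Released in the last 3 years", approximated as 3 * 365 days (assumption).
RECENT_WINDOW = timedelta(days=3 * 365)


def update_interval_days(release_date: date, today: date) -> int:
    """Recent videos refresh daily; older ones every other day."""
    return 1 if today - release_date <= RECENT_WINDOW else 2


def is_due(release_date: date, last_updated: date, today: date) -> bool:
    """True when enough days have passed since the last refresh."""
    return (today - last_updated).days >= update_interval_days(release_date, today)
```

A job that runs every 10 minutes could then pick a batch of videos for which `is_due(...)` is true, which spreads the API calls across the day.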

Database Video types

TL;DR: Main videos are the Music Video with the most views (there can be exact duplicates or different Music Video versions). Alternate videos are videos like dance practices, performances, live acoustic versions and so on. Audio Videos are music-only (most of them are available on Youtube Music). When checking stats for a video, "Main views" covers all Main and Duplicate videos, while "Total" also includes every other type.

Videos can be a "main" video (meaning the main MV or only version of a song), a "duplicate" video (an exact copy of the "main" video), or an "alternate" video, which is usually a different version/video type of the same song (dance versions, acoustic versions, etc). These are filled out by collaborators at the time of registration but can be changed at any time.

Upon registration, collaborators can also choose if a video is a Music Video or an Audio Video (usually B-sides). Audio Videos do not have an actual video but only audio, although some might have a simple visualization or lyrics showing.

A main music video will be automatically switched with its duplicate video if one of two conditions occurs: the 'main' video is removed from Youtube (it becomes a 'dead' video, which is kept only as historical/statistical data), or the 'duplicate' video garners more views than the 'main' video, thus becoming the new 'main'.
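
As a rough sketch of the two switching conditions, the snippet below uses a hypothetical `Video` record and a hypothetical `maybe_promote` helper. Swapping the two role labels and never promoting a dead duplicate are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Video:
    video_id: str
    role: str          # "main" or "duplicate"
    views: int
    dead: bool = False


def maybe_promote(main: Video, duplicate: Video) -> bool:
    """Swap roles when the main video is dead or has been overtaken in views."""
    if duplicate.dead:
        return False  # assumption: a dead duplicate is never promoted
    if main.dead or duplicate.views > main.views:
        main.role, duplicate.role = "duplicate", "main"
        return True
    return False
```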

Music Videos that go offline are kept in the database, marked as 'dead', and stop being updated, whereas Audio Videos are automatically deleted from the database if they go offline. A video must fail its check on two consecutive days to be marked 'dead' or deleted.

Once a month (on the 1st day of each month), all Music Videos marked as 'dead' are re-checked against the Youtube API, so if a video was only temporarily private/blocked and is now back online, it will be "revived".
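
The offline handling can be pictured as a small state machine. The sketch below is illustrative only: the record layout (`kind`, `status`, `fails`) and both function names are hypothetical, and the real checks go through the Youtube API rather than a boolean flag.

```python
from datetime import date


def daily_check(video: dict, online: bool) -> str:
    """Update the failure counter and return the resulting action.

    `video` is a hypothetical record with keys "kind" ("mv" or "audio"),
    "status" ("alive" or "dead") and "fails" (consecutive failed days).
    """
    if online:
        video["fails"] = 0
        return "updated"
    video["fails"] += 1
    if video["fails"] < 2:
        return "retry"            # one failed day is not enough
    if video["kind"] == "audio":
        return "delete"           # Audio Videos are removed outright
    video["status"] = "dead"      # Music Videos are kept for history
    return "dead"


def monthly_revival(video: dict, online: bool, today: date) -> bool:
    """On the 1st of the month, revive dead Music Videos that are back online."""
    if today.day == 1 and video["status"] == "dead" and online:
        video["status"] = "alive"
        video["fails"] = 0
        return True
    return False
```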

CircleChart Physical Sales

TL;DR: Circle releases this chart on a different day every month; we monitor it daily and update as soon as it is released. This is the Circle Album Chart, not the Circle Retail Album Sales chart (introduced in 2021, works more like Hanteo). We also monitor and collate the yearly totals released by Circle in January. For more on the differences between Circle and Hanteo, check our glossary.

CircleChart releases two sets of Physical Sales data. One is the original system, in which distribution minus returns is reported, in use since the old days of MIAK (the chart's name before 2010; it was then called Gaon until 2022). The other, introduced in 2021, is Retail Sales, which is more similar to the Hanteo charts, where direct retail sales are reported. However, CircleChart's network of stores is smaller, and this data is therefore less accurate than both Hanteo and their original method. Since historically only the Distribution data was released, that is the one used on the site. Because they need to account for returns as well as their internal audit, their charts are usually delayed: the Monthly charts are usually released more than a week after the end of each month, while the yearly audit can take 2 weeks and is released in mid-January.

Circle doesn't have a centralized organization, which means artists do not have a unique ID, and each time an artist appears on any of their charts, the name can be (and often is) spelled differently. We have complex algorithms in place to match every artist they mention to our unique ID. We have also been going through all the data (since 2010) to double-check that everything has been matched correctly; so far we have checked 2010, 2011, and 2023 onwards.

Physical Sales are stored in two ways. For each item (album), sales data is stored per month and per year. A total for the artist is also stored per month and per year.

Yearly sales need to be stored on top of the monthly sales because CircleChart applies corrections to their yearly charts based on returns. Also, since sales that do not enter the monthly top 100 are not detected, some of these sales may show up in the yearly top 100 totals. When you check item or artist data per month, you will see the monthly data (there is no way to apply the corrected yearly data to the monthly data). When you check the yearly sales, the data shown will be either the corrected yearly value, if the item appeared in the yearly top 100, or the sum of the monthly tables.
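
The rule for which yearly figure is shown can be sketched in a few lines. The function name and the month-to-units mapping are hypothetical; only the preference order comes from the description above.

```python
from typing import Optional


def yearly_sales(monthly: dict[int, int], yearly_corrected: Optional[int] = None) -> int:
    """Prefer Circle's corrected yearly figure; otherwise sum the monthly charts."""
    if yearly_corrected is not None:
        return yearly_corrected
    return sum(monthly.values())
```

For example, an item with monthly entries of 50,000 and 20,000 shows 70,000 for the year unless it also appeared on the yearly top 100, in which case the yearly value is shown instead.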

It is important to keep in mind that, all things considered, the totals are minimum sales, since small sales do not appear on either the monthly or the yearly charts. Most artists keep selling tiny amounts for months or even years after release, but there is no way to record those unless CircleChart one day publishes per-artist total statistics.

If you are looking for complete sales data, always keep in mind there are two sources for Korean data: Circle (officially linked to the government and audited), and Hanteo (from a commercial organization and not audited). For more on the differences between Circle and Hanteo, check our glossary.

CircleChart Digital Streaming, Billboard Chart

TL;DR: Circle releases this chart on Wednesdays. This is the Circle Digital Chart (which includes downloads), not the Circle Streaming Chart, because we believe it's a better digital benchmark than streams alone. Billboard releases its chart on Monday and we fetch it on Tuesday; this is the top 200 Global chart.

CircleChart has historically released digital streaming data, but the method used to calculate it has changed several times, and in some years they did not publish a number of streams (or scores), only the rank. Since for those years we only have the order of the chart, and since their score formula is not stable across the years, we calculate our own score based on chart position. It is also extremely important to note that we use their Digital chart instead of their Streaming chart. The Digital score is more complete because it includes not only streaming but also digital sales, ringtone sales and BGM sales (all digital goods). As with the Circle streaming scores, its formula has changed over the years, and for the same reason we still have to rely on one centralized score system of our own (see below).

As usual with CircleChart releases, this chart is published a few days after the end of the week, usually on the following Wednesday. The system automatically detects a new week and retrieves it, using the same method described above for Physical Sales to match the artist name to the database name. The system also uses the song name to try to link the entry to one of the artist's MVs.

The weekly data is stored in two arrays. The first stores the actual chart, with the original artist and song names and the resolved artist and MV IDs. The second stores artists by total score per week, using the following formula to calculate the score:

  • First place gets 200 points
  • Second place gets 150 points
  • Third place onwards gets 100 minus its position, thus 97 down to 1 points.

Each artist has a total score equal to the sum of the scores of all their songs that charted.
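
The position formula and the per-artist sum can be sketched as follows. The function names and the `(position, artist_id)` input shape are hypothetical, and clamping the score at zero for positions at or beyond 100 is an assumption.

```python
from collections import defaultdict


def circle_points(position: int) -> int:
    """Points for a weekly Circle Digital chart position."""
    if position == 1:
        return 200
    if position == 2:
        return 150
    return max(0, 100 - position)  # clamping at 0 is an assumption


def weekly_artist_scores(chart: list[tuple[int, str]]) -> dict[str, int]:
    """Sum points per artist from (position, artist_id) chart entries."""
    totals: dict[str, int] = defaultdict(int)
    for position, artist_id in chart:
        totals[artist_id] += circle_points(position)
    return dict(totals)
```

An artist charting at #1 and #3 in the same week would get 200 + 97 = 297 points for that week.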

Since late 2023, we have been going through all the weekly streaming charts since 2010 to make sure all artist and MV links are correct. For the most part, everything from 2023 onwards has been verified as it was released.

For Billboard Chart, since it includes 200 entries, the score is:

  • First place gets 400 points
  • Second place gets 300 points
  • Third place onwards gets 200 minus its position, thus 197 down to 1 points.
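
Both scoring tables follow the same shape: first place is worth twice the chart size, second place one and a half times, and everything else the chart size minus the position. A minimal sketch of that observation, where the function name and the `chart_size` parameter are hypothetical and clamping at zero is an assumption:

```python
def chart_points(position: int, chart_size: int) -> int:
    """Position score for a chart scored on `chart_size` (100 or 200)."""
    if position == 1:
        return 2 * chart_size
    if position == 2:
        return chart_size * 3 // 2
    return max(0, chart_size - position)  # clamping at 0 is an assumption
```

With `chart_size=200` this gives the Billboard values (400, 300, 197 and so on), and with `chart_size=100` the Circle values.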

Contrary to the Circle Chart, only the Artist is resolved on Billboard because the naming can be different and would require a whole new matching system. Besides, we are only interested in when an artist charts, not each song, for Billboard.

Artist Trending System

TL;DR: This system measures which artists have the highest score (trend) in the last couple of weeks. It uses Circle Digital (weekly), Awards, Sales (monthly), Youtube Likes and Billboard Chart to rank them every day.

Data starting from the previous week's Monday is considered and given the following weights:

  • Circle Digital score (uses most recent week): 30%
  • Music Show Awards (all since previous week's Monday): 25%
  • Youtube Likes on Main videos: 20%
  • Billboard Global Chart score: 15%
  • Circle Physical Sales (uses most recent month): 10%

For Circle Digital, Circle Sales and Billboard Global Chart, the most recent data available is used. For Youtube likes, all videos released since the previous week's Monday are used (Main videos only). For Awards, all awards since the previous week's Monday are used (that is, starting with last week's The Show).

Each metric is ranked in a percentile. The best in each ranking receives the full score for that metric (for instance, #1 on Circle Digital gets the full 30% for Digital). For likes, all likes on all videos are counted, and the full score goes to the artist with the most likes. For Awards, all awards are counted, and the full score goes to the artist with the most awards (so if the artist with the most awards has 6 awards, 6 awards is a full score).

This score is calculated once a day.
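
A minimal sketch of how the weights combine is shown below. The weights come from the list above; everything else is an assumption made for illustration, in particular the metric keys, the function names, and the normalization step, which scales each artist's value as a fraction of the best artist's value rather than reproducing the site's exact percentile ranking.

```python
WEIGHTS = {
    "circle_digital": 0.30,
    "awards": 0.25,
    "youtube_likes": 0.20,
    "billboard": 0.15,
    "circle_sales": 0.10,
}


def normalize(values: dict[str, float]) -> dict[str, float]:
    """Scale raw values to 0..1 relative to the best artist (assumption)."""
    best = max(values.values(), default=0)
    if best <= 0:
        return {artist: 0.0 for artist in values}
    return {artist: value / best for artist, value in values.items()}


def trending_scores(raw: dict[str, dict[str, float]]) -> dict[str, float]:
    """raw maps metric name -> {artist: raw value}; returns artist -> 0..1 score."""
    totals: dict[str, float] = {}
    for metric, weight in WEIGHTS.items():
        for artist, share in normalize(raw.get(metric, {})).items():
            totals[artist] = totals.get(artist, 0.0) + weight * share
    return totals
```

Under this reading, an artist who leads every metric scores 1.0, and an artist missing from a metric simply gets nothing for it.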

Time to Success

The "Time to Success" shown for artists is calculated as the time from the artist's debut date to the first occurrence of one of the following, whichever comes first:

  • Music Show Award
  • PAK
  • An item selling 100,000 units
  • #1 song in CircleChart Weekly
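
In other words, the value is the gap between the debut date and the earliest of the four milestone dates. A small sketch, with a hypothetical function name and with `None` standing for a milestone that has not been reached:

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def time_to_success(debut: date, milestones: Iterable[Optional[date]]) -> Optional[timedelta]:
    """Time from debut to the earliest reached milestone, or None if none reached.

    `milestones` holds the dates of the first Music Show Award, first PAK,
    first item reaching 100,000 units, and first weekly #1.
    """
    reached = [d for d in milestones if d is not None]
    if not reached:
        return None
    return min(reached) - debut
```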

"Time to Success" is not shared among the members of a group: each member has their own Time to Success based only on their solo activities. Note that artists who were successful before 2010 might not show one, because most metrics used are limited to CircleChart (2010+) and Youtube (2008+). Music Show Awards are also only tracked starting in 2007.

Mainstream Level

TL;DR: The database has too many artists, so some of us prefer to filter it down to only the more mainstream ones. This section details how we calculate who is mainstream based on Streaming. Circle only has data since 2010, but artists with no releases/streams since 2010 are probably no longer mainstream.

Mainstream Level is calculated to allow users only interested in the Mainstream artists to have a lighter site content. Approximately 50% of content is hidden when the maximum filter is active.

The Mainstream level can range from 1 (highest) to 3 (lowest) and is calculated as follows:

  • Rank 1: Total Circle Score of 7000 or more.
  • Rank 2: Total Circle Score of 1000 or more.
  • Rank 3: Artists that do not fit any of the above.

After that, a few increases are performed based on sales:

  • Artists ranked level 2 but with over 150,000 units sold are upgraded to level 1
  • Artists ranked level 3 but with over 15,000 units sold are upgraded to level 2

Another pass of upgrades is performed based on total views (all MVs combined):

  • Artists ranked level 2 but with over 200M views are upgraded to level 1
  • Artists ranked level 3 but with over 100M views are upgraded to level 2

Sub-units, soloists and collabs also have special upgrade rules:

  • Sub-units will have their level upgraded to the same as the main group if lower
  • Soloists have their level upgraded to the same as the main group if lower
  • Collabs will have their level upgraded to the same as the highest level member if the collab level was lower.
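
The passes above can be sketched as one function plus a helper for the sub-unit/soloist/collab rule. The function names are hypothetical, and applying the rules of each pass in the listed order, so that an artist moves up at most one level per pass, is an assumption about how the passes interact.

```python
def mainstream_level(circle_score: int, units_sold: int, total_views: int) -> int:
    """Return 1 (most mainstream) to 3, following the three passes in order."""
    # Pass 1: base level from the total Circle score.
    if circle_score >= 7000:
        level = 1
    elif circle_score >= 1000:
        level = 2
    else:
        level = 3
    # Pass 2: upgrade on sales (at most one step, an assumption).
    if level == 2 and units_sold > 150_000:
        level = 1
    elif level == 3 and units_sold > 15_000:
        level = 2
    # Pass 3: upgrade on total views across all MVs (at most one step).
    if level == 2 and total_views > 200_000_000:
        level = 1
    elif level == 3 and total_views > 100_000_000:
        level = 2
    return level


def inherit_level(own_level: int, related_levels: list[int]) -> int:
    """Upgrade to the best (lowest-numbered) related level if it beats our own.

    Pass the main group's level for sub-units and soloists, or the
    members' levels for a collab.
    """
    return min([own_level, *related_levels])
```

Under these assumptions, an artist with a low Circle score but 20,000 units sold and 250M views starts at level 3, moves to level 2 on sales, and then to level 1 on views.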

Artist Success (Stars) System

Artists have a Success Rating (Stars) assigned to them, from zero to 5 stars.

This system is kept rather simple to avoid controversy. Each star represents one of the following:

  • Sales up to 500,000
  • Streaming (score) up to 700
  • Awards up to 5
  • Organic Views up to 250,000,000
  • Yearly awards up to 7

The threshold values are based on a subjective cut-off for the top ~150 artists. For instance, ~150 artists have at least 500,000 sales, ~150 artists have a streaming score above 700, and so on. These values are subject to change. Each star can be partially filled, so an artist with 250,000 sales and a streaming score at half its threshold would have 0.5 stars for sales and 0.5 stars for streaming, which add up to 1 star. Artists with good metrics prior to 2010 might be underrated, since most of the data available is digital and limited to CircleChart (2010+) and Youtube (2008+). Music Show Awards are also only tracked starting in 2007.
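
The partial-star arithmetic can be sketched as a capped ratio per metric. The thresholds are copied from the list above as written; the metric keys and the function name are hypothetical.

```python
# Thresholds as listed above; each metric contributes at most one star.
THRESHOLDS = {
    "sales": 500_000,
    "streaming": 700,
    "awards": 5,
    "organic_views": 250_000_000,
    "yearly_awards": 7,
}


def star_rating(metrics: dict[str, float]) -> float:
    """Sum of per-metric fractions, each capped at one full star (0 to 5 total)."""
    return sum(
        min(metrics.get(name, 0) / cap, 1.0)
        for name, cap in THRESHOLDS.items()
    )
```

With these values, `star_rating({"sales": 250_000, "streaming": 350})` is half a star for each metric, one star in total.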

Server Costs, Ads and Donations

In April 2023, I added a Buy me a Coffee donation page and a Paypal donation page to try to recoup the server costs. Below is the data for each year.
I ask for a personal $5 of "support" per month as an incentive to keep the site online.

Year | Hosting/Domain Costs | Donations | Ads
2023 | $200 ($15/mo) | $204 | -
2024 | $144 ($18/mo) up to August | |
     | $139 ($39/mo = $32 host + $2 domain + $5 support) September onwards | |
     | TOTAL: $283 | $283 | -
2025 | $78 ($39/mo = $32 host + $2 domain + $5 support) up to end of February | |
     | $108 ($18/mo = $9 host + $4 domain + $5 support) up to mid-September | |
     | $133 ($33/mo = $25 host + $4 domain + $5 support) mid-September to end of year | |
     | TOTAL: $319 | $315 | $5
2026 | $408 ($33/mo = $25 host + $4 domain + $5 support) + $46 ($3.8/mo alternate host thanks to Soridata.com suspension) = $37/mo | |
     | TOTAL: $454 | $109 |

Note: 2025 is cheaper because we were in our first year with the new host, and the promotional price is less than half the normal price.

After fantards brigaded our main host in January 2026 and successfully got it to suspend the site for a day (we were down for a week, because I decided to take the site down and then bring it back up with changes), we had to separate the daisuki.com.br and soridata.com domains so that disruption aimed at Soridata by idiots would not affect daisuki.com.br. This means we now need to pay for TWO hosting services instead of one. In case of another suspension by Hostinger, we can move to another host within a few hours, and we will add any losses to the donation drive, since you cannot blame me when hosts don't have morals and give in to morons. Most services are now in my home country so I can speak directly with them in case this happens, making it harder for them to fall for it.