Hello everyone. I run Wikidocs, and I'd like to share what I learned fighting AI bots on the Wikidocs server, along with my thoughts on adopting Cloudflare.

Wikidocs isn't a huge site, but it handles a fair amount of traffic. About three months ago, that traffic began climbing abnormally fast. That would be great news if the visitors were real people, but unfortunately most of them were AI training bots scraping Wikidocs' data.

1. The Merciless Onslaught of AI Bots
These bots send requests relentlessly; it is effectively a DDoS attack. Connections from the US outnumber those from Korea by more than 10 to 1, and a Korean-language site with ten times more US visitors than Korean ones is clearly not normal. Some bots behave conscientiously, but others hit us with more than 100 simultaneous requests.

We have fought these bots in all sorts of ways recently. At first we combined Nginx settings with Fail2ban, but we were helpless against bots that kept changing IP addresses through VPNs, and it felt like a losing battle. Word seems to have spread that Wikidocs is a prime source of training data: AI training bots have multiplied over the past week, producing a continuous stream of traffic that the server struggled to handle.

2. Cloudflare: Rekindling a Love-Hate Relationship
In the end, we brought in Cloudflare to put a stop to this fight. Wikidocs had already used Cloudflare three years ago, but we turned it off because of speed issues and Google indexing problems. On the Free, Pro, and Business plans alike, visitors were routed to overseas servers rather than Korean ones, which made the site really slow. The Enterprise plan connects to Korean servers and is fast, but it is unbelievably expensive and only realistic for large corporations. On top of that, about three years ago Google's search bot ran into problems (the exact cause was never clear) and every indexed Wikidocs page vanished from Google. That cut off our Google search traffic and caused AdSense revenue, our main source of income, to plummet.
Despite these past anxieties, we had no choice but to use Cloudflare again. It was an unavoidable decision to prevent the server from crashing due to AI bot attacks.
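As a rough illustration of why the per-IP blocking described in section 1 fell short, here is a minimal sketch of a sliding-window limiter. It is not our actual Nginx or Fail2ban configuration: the class name, the limit of 10 requests per 60 seconds, and the example addresses (taken from documentation-only IP ranges) are all assumptions made for this example.

```python
from collections import defaultdict, deque


class PerIpLimiter:
    """Allow at most `limit` requests per IP within a sliding `window` (seconds)."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recently allowed requests

    def allow(self, ip: str, now: float) -> bool:
        recent = self.hits[ip]
        # Forget timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


def count_allowed(ips, limit=10, window=60.0):
    """Send one request per entry in `ips`, 0.1 s apart; return how many get through."""
    limiter = PerIpLimiter(limit, window)
    return sum(limiter.allow(ip, now=i * 0.1) for i, ip in enumerate(ips))


# Hypothetical traffic: 100 requests within 10 seconds.
single_ip = ["203.0.113.7"] * 100                    # one bot, one address
rotating = [f"198.51.100.{i}" for i in range(100)]   # one bot, a new address per request

print(count_allowed(single_ip))  # 10
print(count_allowed(rotating))   # 100
```

A bot that stays on one address is cut off after its tenth request, while a bot that switches address on every request is never limited at all. That is roughly the situation we were in before moving the filtering to Cloudflare.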
We went with Cloudflare's Pro plan. We didn't need the Business features; all we needed was DDoS protection and security rules to block the massive US traffic. Since enabling Cloudflare, the server has not gone down once. Cloudflare blocks some threats on its own, and for connections from outside Korea we turned on the 'Challenge' feature to make it hard for bots to get through. Remembering the nightmare of losing our indexed Google pages, we carefully excluded search bots from the challenge.

3. Security Solved, But Speed is the Problem
We solved one major problem, but another big one appeared: speed. Cloudflare does have a Korean region, but it will not route non-Enterprise plans through it, citing the high cost of bandwidth in Korea. So a browser visiting Wikidocs was sent back and forth through the US or other places with cheaper bandwidth instead of the Korean servers right next door, and we had to put up with agonizingly slow speeds.

4. The Savior: Argo Smart Routing, and the Cost Dilemma
That's when we found a glimmer of hope: Cloudflare's Argo Smart Routing feature. With it enabled, visitors are connected to the Cloudflare region closest to their location. In other words, it connects Koreans to Korean servers.
This solved the speed issue, but a bigger worry emerged: traffic costs. Argo Smart Routing has a base fee of $5 per month plus $0.10 per GB. On busy days, Wikidocs can generate around 100 GB of traffic. A simple calculation gives $10 per day, or $300 (about 450,000 KRW) per month. And with bots running rampant in the AI era, traffic will only grow, which makes Argo Smart Routing a heavy burden for a small site like Wikidocs. On top of that, if Argo speeds things up, AI bots around the world can reach us even faster, and we might become a 'true hotspot.'

5. Domain Separation Strategy: wikidocs-cdn.net
Caught in this irony, we came up with a plan: move the static files and images, which make up most of Wikidocs' traffic, to Cloudflare R2 and serve them from there to spread the traffic. At first we tried pointing `static.wikidocs.net` at R2, but because Argo bills per domain, that traffic would have counted as the same traffic and hit us with a huge bill. So we bought a new domain, `wikidocs-cdn.net`, and connected it to R2. With the domain completely separate, Argo does not charge for `wikidocs-cdn.net` traffic.

Then another problem came up. It wasn't a huge one, but it was quite annoying. Once static files and images were served from R2 through `wikidocs-cdn.net`, Cloudflare's old speed problem resurfaced. Caching can be applied to static files, but it was not a complete fix, and for various reasons we were not satisfied with the speed.

6. Final Solution: Building a Hybrid CDN
So we stopped using R2 and pointed `wikidocs-cdn.net` back at the Wikidocs server, building our own CDN-like setup in which Nginx serves only static files and images. At last we have a site with satisfactory speed that also handles AI bots properly. The price is the cost of Cloudflare Pro, Argo Smart Routing, and the extra domain. Separating static files and images reduced the Argo charges, but we will need to keep an eye on them.
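To make the Argo arithmetic above easy to re-run as traffic changes, here is a small sketch. The base fee, per-GB rate, and 100 GB/day figure are the ones quoted in this post; the function name and the `static_share` parameter (the fraction of traffic moved to the separate domain) are assumptions for illustration, and the 0.7 used below is a made-up value, not a measurement.

```python
def argo_monthly_cost_usd(gb_per_day: float,
                          days: int = 30,
                          base_fee: float = 5.0,
                          per_gb: float = 0.10,
                          static_share: float = 0.0) -> float:
    """Estimate a monthly Argo bill in USD.

    static_share is the (hypothetical) fraction of traffic served from a
    separate domain and therefore not billed by Argo.
    """
    if not 0.0 <= static_share <= 1.0:
        raise ValueError("static_share must be between 0 and 1")
    billed_gb = gb_per_day * (1.0 - static_share) * days
    return round(base_fee + billed_gb * per_gb, 2)


# 100 GB/day for 30 days: $300 of usage plus the $5 base fee.
print(argo_monthly_cost_usd(100))  # 305.0

# Made-up example: 70% of traffic moved to wikidocs-cdn.net.
print(argo_monthly_cost_usd(100, static_share=0.7))
```

Note that the $300 figure in section 4 covers usage only; with the $5 base fee the same month comes to $305.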
The AI era has brought a world where content sites must bear the cost of bot traffic. T_T For now, the immediate crisis is averted, so we need to nurture the service well to justify the security costs. Thanks for reading this long post.
