🧐 Why Build This? / 为什么要造这个轮子?

Moving to a new country involves a lot of paperwork. In the Czech Republic, waiting for a visa or residency permit decision can be nerve-wracking. The Ministry of Interior (MOI) publishes status updates in Excel/PDF files or via a web portal, but checking them manually every day is a recipe for anxiety.
移居到一个新国家需要处理大量的文件。在捷克,等待签证或居留许可的决定往往让人心力交瘁。内政部 (MOI) 通过 Excel/PDF 文件或门户网站发布状态更新,但每天手动检查简直是在制造焦虑。

🧐 The State of Existing Tools / 现有的工具与其局限

Before building this, I surveyed the open-source landscape. Most existing solutions fall into two categories:
在构建此系统之前,我调研了开源社区。大多数现有方案主要分为两类:

  1. Outdated Excel Parsers: Projects like jonathas/czech-visa-check were built for an earlier era, when the MOI published status updates as Excel sheets. They no longer work with the current web portal.
    过时的 Excel 解析器:像 jonathas/czech-visa-check 这样的项目是为旧时代设计的(当时内政部通过 Excel 发布更新)。它们已无法适应现代化的 Web 门户。
  2. Simple Cron Scripts: Tools like sfabrizio/auto-check-cz-visa offer basic monitoring but often lack robustness. They typically run as plain cron jobs on a local machine and tend to fail when the network is flaky or the IP gets rate-limited.
    简单的 Cron 脚本:像 sfabrizio/auto-check-cz-visa 这样的工具提供了基础监控,但往往缺乏健壮性。它们通常作为本地机器上的简单定时任务运行,容易因网络抖动或 IP 速率限制而失效。

Why “Reinvent the Wheel”? / 为什么要“重复造轮子”?

I didn’t just want a script that works; I wanted a system that survives.
我不仅仅想要一个能用的脚本;我想要一个能够长期存活的系统。

  • Resilience (韧性): Unlike simple requests-based scrapers, this project uses Playwright to handle complex frontend rendering and anti-bot measures.
  • Scale (规模): It’s not just for me. The multi-user architecture allows a single server to monitor statuses for an entire friend group or community.
  • Observability (可观测性): It distinguishes between “Site Down” and “Application Updates”, preventing panic-inducing false alarms.

Thus, Czech-Visa-Application-Status-Check was engineered as an SRE-grade solution.


⚙️ Configuration Deep Dive / 配置详解

The system is highly configurable via environment variables. Here is a production-ready example:
该系统通过环境变量高度可配置。以下是一个生产就绪的配置示例:

# Core Settings
Check_Interval=3600 # Check every hour / 每小时检查一次
HEADLESS=true # Run browser in background / 后台运行浏览器

# Notification Channels (Optional)
# Email via SMTP
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
SMTP_USERNAME=me@gmail.com
SMTP_PASSWORD=app-password

# Telegram Bot
TELEGRAM_BOT_TOKEN=123456:ABC-DEF...
TELEGRAM_CHAT_ID=-100123456789
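
To show how a process might consume these variables, here is a minimal sketch that reads them into a typed settings object. The `Settings` class and `load_settings` helper are hypothetical names used for illustration, not the project's actual code. The variable names are copied from the example above, and the fallback defaults are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables in the example above."""
    check_interval: int                 # seconds between checks
    headless: bool                      # run the browser without a window
    smtp_server: Optional[str]          # None means email is not configured
    smtp_port: int
    telegram_bot_token: Optional[str]   # None means Telegram is not configured
    telegram_chat_id: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping; the defaults here are assumptions."""
    return Settings(
        check_interval=int(env.get("Check_Interval", "3600")),
        headless=env.get("HEADLESS", "true").strip().lower() in {"1", "true", "yes"},
        smtp_server=env.get("SMTP_SERVER") or None,
        smtp_port=int(env.get("SMTP_PORT", "587")),
        telegram_bot_token=env.get("TELEGRAM_BOT_TOKEN") or None,
        telegram_chat_id=env.get("TELEGRAM_CHAT_ID") or None,
    )


# Passing a plain dict makes the loader easy to try without touching the real environment.
settings = load_settings({"Check_Interval": "1800", "HEADLESS": "false"})
print(settings.check_interval, settings.headless, settings.smtp_server)
```

Treating an unset notification variable as None keeps the channels optional, which matches the "Optional" note in the example.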

🏗️ SRE-Grade Architecture / SRE 级别的架构设计

This isn’t just a simple scraping script. It’s engineered with Site Reliability Engineering (SRE) principles in mind.
这不仅仅是一个简单的爬虫脚本。它是基于 SRE(网站可靠性工程) 原则设计的。

1. Robust Scraper (Playwright) / 健壮的爬虫 (Playwright)

Plain HTTP requests are easily blocked, and they break on pages that need JavaScript to render, so I used Playwright instead.
为了防止被屏蔽或因 JavaScript 渲染而失效,我使用了 Playwright 而非简单的 HTTP 请求。

  • Resource Blocking: Blocks images, fonts, and CSS to cut bandwidth usage and speed up queries.
  • Auto-Recovery: If the browser crashes, a context manager ensures the browser process is cleaned up, preventing memory leaks.
  • 资源屏蔽:屏蔽图片、字体和 CSS,以最小化带宽占用并加速查询。
  • 自动恢复:如果浏览器崩溃,上下文管理器会确保浏览器进程被清理,防止内存泄漏。

2. Atomic Writes (Data Safety) / 原子写入 (数据安全)

How do you prevent data corruption if the power fails mid-write? Atomic writes.
如果在写入过程中断电,如何防止数据损坏?原子写入。

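# Simplified logic (data is the status dict to persist)
import json
import os
import tempfile

# Create the temp file next to the target so os.replace stays on one filesystem
with tempfile.NamedTemporaryFile('w', dir='.', delete=False) as tmp:
    json.dump(data, tmp)
    tmp.flush()
    os.fsync(tmp.fileno())  # Force write to disk
os.replace(tmp.name, "status.json")  # Atomic swap, done after the file is closed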

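The system never overwrites the live status file directly. It writes to a temp file first, flushes it to disk, then atomically renames it over the old one. A reader sees either the complete old file or the complete new one, never a half-written file.
系统从不直接覆盖正在使用的状态文件。它先写入临时文件并刷入磁盘,然后以原子方式替换旧文件。读取方看到的要么是完整的旧文件,要么是完整的新文件,绝不会是写了一半的文件。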
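
For reference, the same pattern can be wrapped in a reusable helper. This is a minimal sketch, not the project's code: `atomic_write_json` is a hypothetical name, and the cleanup on failure is an addition of mine. It places the temp file in the target's directory, since a rename is only atomic within a single filesystem.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, data) -> None:
    """Write data to path so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise os.replace cannot perform an atomic rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, ensure_ascii=False)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        # Leave no stray temp file behind if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


# Demo in a throwaway directory: the second write replaces the first.
with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, "status.json")
    atomic_write_json(target, {"status": "In Process"})
    atomic_write_json(target, {"status": "Approved"})
    with open(target, encoding="utf-8") as f:
        print(json.load(f))
```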

3. LKVS (Smart Notifications) / LKVS (智能通知)

“Query Failed” is not a status update. It’s noise.
“查询失败”不是状态更新,它是噪音。

I implemented the LKVS (Last Known Valid Status) mechanism:
我实施了 LKVS(最后已知有效状态)机制:

  • If a query fails due to network issues, stay silent.
  • When it recovers, compare the new status with the last valid status (not the error state).
  • Only send notifications for real changes (e.g., “In Process” -> “Approved”).
  • 如果因网络问题查询失败,保持静默。
  • 当恢复时,将新状态与最后有效状态(而非错误状态)进行比较。
  • 仅针对真实变更(如“处理中” -> “已批准”)发送通知。
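
The three rules above fit in a few lines. The sketch below is illustrative only: `LkvsTracker` and `observe` are hypothetical names, a failed query is represented as None, and staying silent on the very first successful query is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LkvsTracker:
    """Tracks the last known valid status for one application (hypothetical)."""
    last_valid: Optional[str] = None

    def observe(self, status: Optional[str]) -> Optional[str]:
        """Feed one query result; return a notification message or None.

        status is None when the query failed (network error, site down).
        """
        if status is None:
            return None  # failure: stay silent and keep last_valid untouched
        previous, self.last_valid = self.last_valid, status
        if previous is None or previous == status:
            return None  # first sighting (assumed silent) or no real change
        return f"{previous} -> {status}"


tracker = LkvsTracker()
results = ["In Process", None, None, "In Process", "Approved"]
messages = [tracker.observe(r) for r in results]
print(messages)  # [None, None, None, None, 'In Process -> Approved']
```

The two failed queries in the middle never reach `last_valid`, so the recovery to “In Process” compares equal to the stored value and produces no alert. Only the real change to “Approved” does.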

🛠️ Deployment Guide / 部署指南

Option 1: Docker / 方案一:Docker

The easiest way to run it anywhere (NAS, VPS, Raspberry Pi).
在任何地方(NAS, VPS, 树莓派)运行的最简单方法。

# 1. Clone repo
git clone https://github.com/yuanweize/Czech-Visa-Application-Status-Check.git
cd Czech-Visa-Application-Status-Check

# 2. Config
cp .env.example .env
vi .env # Add your email settings

# 3. Launch
docker-compose up -d

Option 2: Bare Metal / 方案二:裸机运行

If you prefer to run it directly with Python:
如果你更喜欢直接在 Python 环境运行。

pip install -r requirements.txt
playwright install chromium
python visa_status.py monitor

🌟 Support & Collaborate / 支持与协作

This project is open source under the MIT License.

If this tool saved you from refreshing the page 100 times a day, please Star 🌟 the repo!
本项目基于 MIT 协议开源。
如果这个工具把你从每天刷新 100 次页面的地狱中解救出来,请给仓库点个 Star 🌟!

