Doing QA in fintech, e-commerce, or government projects means generating large volumes of test data, but using real personal data in test environments violates PDPA / GDPR. This article covers why fake data is required, Taiwan's official checksum algorithms, and real QA scenarios where this data matters.
Why you must use fake data (compliance)
Taiwan's Personal Data Protection Act explicitly lists national ID numbers as personal data. Internal dev/test environments that use real IDs (even employees') flunk audits.
Real-world consequences:
- Fintech QA: KYC flows need thousands of IDs — they have to be format-valid but anonymous
- E-commerce: load testing the member system with 10K fake accounts that pass format validation
- Government / health API integration: sandbox environments need format-correct test data
- Risk-control rule testing: simulate users of different ages / genders / regions
The point: fake data must be format-valid (it passes validators) but fictional (it corresponds to no real person).
Taiwan national ID algorithm (official)
10 characters: [A-Z] + [12] + 8 digits (the last digit is the check digit).
Steps:
- Letter → number: A=10, B=11, C=12 … I=34, O=35, W=32, X=30, Y=31, Z=33 (I, O, and W are out of order for historical reasons)
- Split the letter's number into two digits N1 N2
- Weighted sum: N1 × 1 + N2 × 9 + d1 × 8 + d2 × 7 + d3 × 6 + d4 × 5 + d5 × 4 + d6 × 3 + d7 × 2 + d8 × 1, where d1 is the second character (the gender digit) and d2 to d8 are the next seven digits
- Checksum = (10 - sum mod 10) mod 10, stored as the 10th character
- The second character: 1 = male, 2 = female (new residents use 8 or 9; this tool doesn't generate those)
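Example: A123456789. A → 10, so N1 = 1 and N2 = 0, contributing 1 × 1 + 0 × 9 = 1. The digits 12345678 with weights 8 down to 1 contribute 8 + 14 + 18 + 20 + 20 + 18 + 14 + 8 = 120, so the sum is 121 and the checksum is (10 - 1) mod 10 = 9. Adding the checksum gives 130, and the total mod 10 is 0 as required.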
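The steps above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual source: the function names are hypothetical, and the full letter table is filled in from the commonly published mapping (the list above only shows part of it). It handles the 1/2 variant only.

```python
import re

# Letter-to-number table for the first character. Assumption: this full table
# follows the commonly published mapping; the article lists only a subset.
LETTER_CODES = {
    "A": 10, "B": 11, "C": 12, "D": 13, "E": 14, "F": 15, "G": 16, "H": 17,
    "I": 34, "J": 18, "K": 19, "L": 20, "M": 21, "N": 22, "O": 35, "P": 23,
    "Q": 24, "R": 25, "S": 26, "T": 27, "U": 28, "V": 29, "W": 32, "X": 30,
    "Y": 31, "Z": 33,
}

# One letter, a gender digit (1 or 2), then eight more digits.
ID_PATTERN = re.compile(r"[A-Z][12][0-9]{8}")


def tw_id_check_digit(first_nine: str) -> int:
    """Return the check digit for a letter followed by eight digits."""
    n1, n2 = divmod(LETTER_CODES[first_nine[0]], 10)
    total = n1 * 1 + n2 * 9
    # d1..d8 take the weights 8 down to 1.
    for weight, ch in zip(range(8, 0, -1), first_nine[1:9]):
        total += weight * int(ch)
    return (10 - total % 10) % 10


def is_valid_tw_id(value: str) -> bool:
    """Format check plus checksum check (1/2 variant only)."""
    if not ID_PATTERN.fullmatch(value):
        return False
    return tw_id_check_digit(value[:9]) == int(value[9])
```

Running the format check before the arithmetic keeps the checksum function free of input-sanitising branches, so the same function can be reused when generating IDs.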
Business number (8 digits) — the official rule
8 digits, weights [1, 2, 1, 2, 1, 2, 4, 1].
Algorithm:
- Multiply each digit by its weight
- If a product ≥ 10, add the ones and tens digits (e.g. 7 × 2 = 14 → 1 + 4 = 5)
- Sum all results
- If sum mod 10 == 0 → valid
- Special case: if the 7th digit is 7 (7 × 4 = 28 → 2 + 8 = 10, which adds nothing to sum mod 10), then (sum + 1) mod 10 == 0 is also valid
This special 7th-digit-7 rule is a historical National Tax Bureau exception to avoid number-collision. Forgetting it in your validator wrongly rejects valid business numbers.
The TW test data tool matches the official algorithm so the output passes most validators.
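A minimal Python sketch of the business-number rule described above, including the 7th-digit special case. The function name is hypothetical, and the sample numbers in the usage note are constructed by hand for illustration.

```python
WEIGHTS = (1, 2, 1, 2, 1, 2, 4, 1)


def is_valid_business_number(value: str) -> bool:
    """Check an 8-digit business number against the weighted digit-sum rule."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    total = 0
    for ch, weight in zip(value, WEIGHTS):
        product = int(ch) * weight
        # Add the tens and ones digits of each product (14 -> 1 + 4 = 5).
        total += product // 10 + product % 10
    if total % 10 == 0:
        return True
    # Special case: a 7 in the 7th position may also pass with sum + 1.
    return value[6] == "7" and (total + 1) % 10 == 0
```

With this sketch, "11111117" passes the ordinary rule (sum 20), "12345675" passes only through the special case (sum 39), and "12345677" is rejected.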
Luhn credit card algorithm (universal)
Luhn is the global standard for credit-card check digits. Used by Visa, Mastercard, JCB, Amex — and also IMEI numbers, Canadian Social Insurance Numbers, and more.
Steps (from right to left):
- Rightmost digit is the check digit — set aside
- Starting from the second-from-right, double every other digit
- If doubling gives > 9, subtract 9 (equivalent to summing the tens and ones digits)
- Sum all digits (including the check digit)
- Total mod 10 == 0 → valid
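Example: 4532015112830366. After doubling every other digit from the second-from-right and subtracting 9 where needed, the digits sum to 50, and 50 mod 10 == 0, so the number is valid.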
Stripe test cards: 4242424242424242 (Visa), 5555555555554444 (Mastercard) — all pass Luhn but are well-known test numbers Stripe never actually charges.
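The Luhn steps can be sketched in Python like this. The function name is hypothetical, and the sketch checks only the check digit, not card length or brand prefix.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    # Walk right to left: index 0 is the check digit, index 1 is the
    # first digit to be doubled, then every other digit after that.
    for index, ch in enumerate(reversed(number)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the tens and ones digits
        total += digit
    return total % 10 == 0
```

The example number and both Stripe test cards above pass this check; changing the last digit of any of them makes it fail.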
Real QA scenarios
How I actually use this in fintech QA:
- Account opening load tests: generate 10K valid IDs + business numbers + credit cards, drive the open-account API with JMeter / k6 → percentile analysis
- Risk-rule regression: use IDs with different gender / region letters to verify rules don't misfire
- Foreign-resident onboarding: test flows for the new-resident ID variant (second character 8 or 9 instead of 1 or 2)
- Credit card binding: test BIN-routing logic by feeding Visa / Mastercard / JCB cards of different brand prefixes
- DB seed: when the test environment boots, seed 100 fake accounts so QA and PM can demo immediately
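For the load-test and seeding scenarios above, a generator can be sketched as follows. This is an illustration under stated assumptions, not the TW test data tool's implementation: every function name is hypothetical, the default card prefix "4" (Visa) is just an example, and a fixed seed is used so a test run can be reproduced.

```python
import random

# Position in this string + 10 gives each letter's number (A=10 ... O=35).
LETTERS = "ABCDEFGHJKLMNPQRSTUVXYWZIO"
DIGITS = "0123456789"


def tw_id_check_digit(first_nine):
    n1, n2 = divmod(LETTERS.index(first_nine[0]) + 10, 10)
    weighted = sum(w * int(c) for w, c in zip(range(8, 0, -1), first_nine[1:]))
    return (10 - (n1 + 9 * n2 + weighted) % 10) % 10


def luhn_check_digit(partial):
    # partial has no check digit yet, so its rightmost digit is doubled first.
    total = 0
    for index, ch in enumerate(reversed(partial)):
        digit = int(ch)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def fake_tw_id(rng, letter=None, gender=None):
    """Build a format-valid ID; letter and gender can be pinned for rule tests."""
    letter = letter or rng.choice(LETTERS)
    gender = gender or rng.choice("12")
    body = letter + gender + "".join(rng.choice(DIGITS) for _ in range(7))
    return body + str(tw_id_check_digit(body))


def fake_card(rng, prefix="4", length=16):
    """Build a Luhn-valid number starting with the given brand prefix."""
    filler = "".join(rng.choice(DIGITS) for _ in range(length - len(prefix) - 1))
    partial = prefix + filler
    return partial + str(luhn_check_digit(partial))


def seed_rows(count, seed=0):
    """Return (id, card) pairs; the same seed always yields the same rows."""
    rng = random.Random(seed)
    return [(fake_tw_id(rng), fake_card(rng)) for _ in range(count)]
```

Pinning the letter or gender, for example `fake_tw_id(rng, letter="A", gender="2")`, gives the targeted inputs that risk-rule regression needs, and passing a different `prefix` to `fake_card` exercises BIN routing per brand.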
Important: data from the TW test data tool is for format testing only. Using it to open real accounts or register for real services constitutes fraud.
Try it now: generate 100 IDs in the TW test data tool, feed them into your company's validator, and confirm 100% pass.