Taiwan test data in depth: national ID algorithm, business-number 7th-digit rule, Luhn credit cards

Doing QA in fintech, e-commerce, or government projects means generating large volumes of test data, but putting real personal data into test environments can violate privacy laws such as Taiwan's PDPA or the GDPR. This article covers why fake data is required, Taiwan's official checksum algorithms, and real QA scenarios where this data matters.

Why you must use fake data (compliance)

Taiwan's Personal Data Protection Act treats national ID numbers as protected personal data. Internal dev/test environments that use real IDs (even employees') flunk audits.

Where this comes up in practice:

Fintech QA: KYC flows need thousands of IDs — they have to be format-valid but anonymous
E-commerce: load testing the member system with 10K fake accounts that pass format validation
Government / health API integration: sandbox environments need format-correct test data
Risk-control rule testing: simulate users of different ages / genders / regions

The point: fake data must be format-valid (it passes validators) but fictional (it corresponds to no real person).

Taiwan national ID algorithm (official)

10 characters: [A-Z] + [12] + 8 digits (7 serial digits followed by 1 check digit).

Steps:

Letter → number: A=10, B=11, C=12 … H=17, then J=18 … N=22, P=23 … V=29, X=30, Y=31, W=32, Z=33, I=34, O=35 (I, O, W, and Z were assigned out of alphabetical order for historical reasons)
Split the letter's number into two digits N1 N2
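Weighted sum, where d1 … d8 are the eight digits after the letter (d1 is the second character; the check digit itself is not included): N1 × 1 + N2 × 9 + d1 × 8 + d2 × 7 + d3 × 6 + d4 × 5 + d5 × 4 + d6 × 3 + d7 × 2 + d8 × 1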
Checksum = (10 - sum mod 10) mod 10
The second character: 1 = male, 2 = female (new residents use 8, 9 — this tool doesn't generate those)

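Example: A123456789. A → 10, so N1 = 1 and N2 = 0, contributing 1 × 1 + 0 × 9 = 1. The digits 1 2 3 4 5 6 7 8 with weights 8 down to 1 contribute 8 + 14 + 18 + 20 + 20 + 18 + 14 + 8 = 120, so the sum is 121 and the checksum is (10 - 121 mod 10) mod 10 = 9, which matches the last digit. Equivalently, the total including the check digit (130) mod 10 must be 0.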
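The steps above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the function names `id_check_digit` and `is_valid_national_id` are made up for this example, and the validator only accepts second character 1 or 2, matching the format described here.

```python
import re

# Letters listed in order of their assigned values 10..35
# (W, Z, I and O sit out of alphabetical order).
LETTER_VALUES = dict(zip("ABCDEFGHJKLMNPQRSTUVXYWZIO", range(10, 36)))


def id_check_digit(prefix: str) -> int:
    """Return the check digit for a 9-character prefix: one letter + 8 digits."""
    if not re.fullmatch(r"[A-Z][0-9]{8}", prefix):
        raise ValueError("expected one letter A-Z followed by 8 digits")
    n1, n2 = divmod(LETTER_VALUES[prefix[0]], 10)  # split the letter value into N1, N2
    total = n1 * 1 + n2 * 9
    # The eight digits after the letter take weights 8, 7, ..., 1.
    for weight, ch in zip(range(8, 0, -1), prefix[1:]):
        total += weight * int(ch)
    return (10 - total % 10) % 10


def is_valid_national_id(value: str) -> bool:
    """Check format (letter, 1 or 2, 8 digits) and the trailing check digit."""
    if not re.fullmatch(r"[A-Z][12][0-9]{8}", value):
        return False
    return id_check_digit(value[:9]) == int(value[9])
```

With the worked example, `id_check_digit("A12345678")` returns 9 and `is_valid_national_id("A123456789")` is true.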

Business number (8 digits) — the official rule

8 digits, weights [1, 2, 1, 2, 1, 2, 4, 1].

Algorithm:

Multiply each digit by its weight
If a product ≥ 10, add the ones and tens digits (e.g. 7 × 2 = 14 → 1 + 4 = 5)
Sum all results
If sum mod 10 == 0 → valid
Special case: if the 7th digit is 7, then (sum + 1) mod 10 == 0 is also valid

This 7th-digit rule is a long-standing exception in the official scheme: 7 × 4 = 28 reduces to 2 + 8 = 10, and that 10 may be counted as either 0 or 1. Forgetting it in your validator wrongly rejects valid business numbers.
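A validator following the rule as listed above might look like the sketch below. The function name `is_valid_business_number` is an assumption for illustration, and the code implements only the mod-10 rule and 7th-digit exception described in this article.

```python
import re

WEIGHTS = (1, 2, 1, 2, 1, 2, 4, 1)


def is_valid_business_number(value: str) -> bool:
    """Validate an 8-digit business number using the weights above."""
    if not re.fullmatch(r"[0-9]{8}", value):
        return False
    total = 0
    for weight, ch in zip(WEIGHTS, value):
        # Add the tens and ones digits of each product (14 -> 1 + 4 = 5).
        tens, ones = divmod(weight * int(ch), 10)
        total += tens + ones
    if total % 10 == 0:
        return True
    # Special case: when the 7th digit is 7, (sum + 1) mod 10 == 0 also passes.
    return value[6] == "7" and (total + 1) % 10 == 0
```

For instance, 12345676 sums to 40 and passes the normal rule, while 12345675 sums to 39 and passes only through the 7th-digit exception. A validator without that branch would reject it.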

The TW test data tool matches the official algorithm so the output passes most validators.

Luhn credit card algorithm (universal)

Luhn is the global standard for credit-card check digits. Used by Visa, Mastercard, JCB, Amex — and also IMEI numbers, Canadian Social Insurance Numbers, and more.

Steps (from right to left):

Rightmost digit is the check digit — set aside
Starting from the second-from-right, double every other digit
If doubling gives > 9, subtract 9 (equivalent to summing the tens and ones digits)
Sum all digits (including the check digit)
Total mod 10 == 0 → valid

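Example: 4532015112830366. Doubling every second digit from the right (subtracting 9 where the result exceeds 9) and summing all 16 digits gives 50, and 50 mod 10 == 0, so the number is valid.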
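The right-to-left steps translate fairly directly into Python. This is a small sketch with illustrative names (`luhn_total`, `is_luhn_valid`); it checks the Luhn sum only and says nothing about whether a card number was ever issued.

```python
import re


def luhn_total(number: str) -> int:
    """Sum the digits right to left, doubling every second one."""
    total = 0
    for index, ch in enumerate(reversed(number)):
        digit = int(ch)
        if index % 2 == 1:  # second-from-right, fourth-from-right, ...
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the tens and ones digits
        total += digit
    return total


def is_luhn_valid(number: str) -> bool:
    """True when the input is all ASCII digits and the Luhn total ends in 0."""
    if not re.fullmatch(r"[0-9]{2,}", number):
        return False
    return luhn_total(number) % 10 == 0
```

Here `luhn_total("4532015112830366")` evaluates to 50, so `is_luhn_valid` accepts the example number.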

Stripe test cards: 4242424242424242 (Visa), 5555555555554444 (Mastercard). Both pass Luhn, but they are well-known test numbers that only work in Stripe's test mode and are never actually charged.

Real QA scenarios

How I actually use this in fintech QA:

Account opening load tests: generate 10K valid IDs + business numbers + credit cards, drive the open-account API with JMeter / k6, then analyze the response-time percentiles
Risk-rule regression: use IDs with different gender / region letters to verify rules don't misfire
Foreign-resident onboarding: test the flows for the new-resident ID variant (second character 8 or 9 instead of 1 or 2)
Credit card binding: test BIN-routing logic by feeding Visa / Mastercard / JCB cards of different brand prefixes
DB seed: when the test environment boots, seed 100 fake accounts so QA and PM can demo immediately
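For the load-test and DB-seed scenarios, a seeding script could look roughly like the sketch below. Everything here is an assumption for illustration rather than the tool's real interface: the helper names, the row layout, and the three brand prefixes (real BIN ranges are wider). A fixed random seed keeps runs reproducible.

```python
import csv
import random

# Letters listed in order of their assigned values 10..35.
LETTER_VALUES = dict(zip("ABCDEFGHJKLMNPQRSTUVXYWZIO", range(10, 36)))
# Illustrative brand prefixes only; real BIN ranges are wider than this.
CARD_PREFIXES = {"visa": "4", "mastercard": "51", "jcb": "3528"}
FIELDS = ["national_id", "gender", "brand", "card_number"]


def complete_national_id(prefix: str) -> str:
    """Append the check digit to letter + gender digit + 7 digits."""
    n1, n2 = divmod(LETTER_VALUES[prefix[0]], 10)
    total = n1 + 9 * n2
    total += sum(w * int(c) for w, c in zip(range(8, 0, -1), prefix[1:]))
    return prefix + str((10 - total % 10) % 10)


def complete_card_number(body: str) -> str:
    """Append the Luhn check digit to a card number that lacks one."""
    total = 0
    for index, ch in enumerate(reversed(body)):
        digit = int(ch)
        if index % 2 == 0:  # these land on doubled positions once the check digit is added
            digit = digit * 2 - 9 if digit > 4 else digit * 2
        total += digit
    return body + str((10 - total % 10) % 10)


def random_digits(rng: random.Random, count: int) -> str:
    return "".join(rng.choice("0123456789") for _ in range(count))


def seed_rows(count: int, seed: int = 0) -> list:
    """Build `count` rows with unique fake IDs and a 16-digit card of a random brand."""
    rng = random.Random(seed)
    rows, seen = [], set()
    while len(rows) < count:
        letter = rng.choice(sorted(LETTER_VALUES))
        gender = rng.choice("12")
        national_id = complete_national_id(letter + gender + random_digits(rng, 7))
        if national_id in seen:
            continue  # skip duplicates so every account gets its own ID
        seen.add(national_id)
        brand = rng.choice(sorted(CARD_PREFIXES))
        prefix = CARD_PREFIXES[brand]
        card = complete_card_number(prefix + random_digits(rng, 15 - len(prefix)))
        rows.append({"national_id": national_id, "gender": gender,
                     "brand": brand, "card_number": card})
    return rows


def write_csv(rows: list, stream) -> None:
    """Write the rows as CSV to any text stream (a file, sys.stdout, ...)."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

Calling `write_csv(seed_rows(100), open("seed.csv", "w", newline=""))` would produce a file that a k6 or JMeter data loader, or a DB seed step, can read. The `gender` and `brand` columns make it easy to slice the data for risk-rule and BIN-routing checks.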

Important: data from the TW test data tool is for format testing only. Using it to open real accounts or register for real services constitutes fraud.

Try it now: generate 100 IDs in the TW test data tool, feed them into your company's validator, and confirm 100% pass.