Unknown Word Ratio Calculator

Paste a passage and estimate how much of it may sit above your current reading level before you commit to the book or build the deck.

Free | Instant result | No upload needed
Estimate unknown-word ratio against your current reading level.
Translate raw difficulty into reading-fit language.
Use the result before you turn a hard book into an even heavier deck.

Text analyzer panel

Paste text and watch the estimate update immediately.

Live analysis
B1 reader estimate: about 40.0% of this sample may sit above your current vocabulary comfort zone.
Unknown-word ratio: 40.0%. Estimated words above B1.
Likely known coverage: 60.0%. Higher known coverage usually makes sustained reading easier.
Estimated text level: C1. Low confidence based on the current sample.
Reading fit: Likely too hard. Heuristic fit based on vocabulary gap rather than official assessment.

Takeaways

At this ratio, the text likely pushes too much decoding into every page. Consider an easier text, a shorter excerpt, or stricter extraction filters.

Because the sample is short, treat the percentage as directional rather than exact.

Potentially difficult signal words include "justify" and "preliminary".

Matched signals

justify, preliminary

How to interpret this tool

A book can look attractive in the abstract and still be wrong for your current level because the unknown-word ratio is too high. That is what kills reading flow first.

This tool compares the sample against your current CEFR level and estimates whether the text is readable, stretchable, or probably too expensive for sustained reading.
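The comparison described above can be sketched roughly as follows. This is a minimal illustration, not BookToAnki's actual implementation: the `known_words` set, the tokenizer pattern, the function names, and the `readable_max` / `stretch_max` cut-offs are all assumptions chosen for the example.

```python
import re


def unknown_word_ratio(text, known_words):
    """Return the share of tokens (0.0 to 1.0) that are not in known_words."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in known_words)
    return unknown / len(tokens)


def reading_fit(ratio, readable_max=0.05, stretch_max=0.15):
    """Map a ratio to a fit label. The cut-offs are hypothetical, not the tool's."""
    if ratio <= readable_max:
        return "readable"
    if ratio <= stretch_max:
        return "stretchable"
    return "likely too hard"


# Hypothetical stand-in for a reader's known vocabulary.
known = {"we", "the", "cost"}
sample = "We justify the preliminary cost"

ratio = unknown_word_ratio(sample, known)
print(f"Unknown-word ratio: {ratio:.1%}")
print(f"Likely known coverage: {1 - ratio:.1%}")
print(f"Reading fit: {reading_fit(ratio)}")
```

In this toy sample, two of the five tokens ("justify" and "preliminary") fall outside the known set, so the ratio is 40.0% and the known coverage is 60.0%. A real known-word list would need to be far larger and would normally match on lemmas rather than raw word forms.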

Usage notes

Treat short samples as directional only. The tool becomes more useful when the text reflects the real body of the book.
If vocabulary looks manageable but the text still feels hard, syntax or topic familiarity may be doing the damage.
Use the result to compare texts against each other, not to pretend the score is an official assessment.
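The notes above suggest using the score comparatively and treating short samples with caution. One way to picture that is the sketch below, which ranks candidate texts by estimated ratio and flags short samples. The `MIN_TOKENS` cut-off, the function names, and the sample data are hypothetical and are not taken from the tool.

```python
import re

MIN_TOKENS = 200  # hypothetical length below which a score is only directional


def score_sample(text, known_words):
    """Score one sample: unknown-word ratio, token count, and a short-sample flag."""
    tokens = re.findall(r"[a-z']+", text.lower())
    unknown = [token for token in tokens if token not in known_words]
    ratio = len(unknown) / len(tokens) if tokens else 0.0
    return {
        "ratio": ratio,
        "tokens": len(tokens),
        "directional_only": len(tokens) < MIN_TOKENS,
    }


def rank_texts(samples, known_words):
    """Return (title, score) pairs sorted from lowest to highest unknown-word ratio."""
    scored = {title: score_sample(text, known_words) for title, text in samples.items()}
    return sorted(scored.items(), key=lambda item: item[1]["ratio"])


# Hypothetical known-word set and excerpts, for illustration only.
known = {"the", "cat", "sat", "on", "mat", "we", "cost"}
samples = {
    "Book A": "The cat sat on the mat",
    "Book B": "We justify the preliminary cost",
}

ranking = rank_texts(samples, known)
for title, score in ranking:
    note = " (short sample, directional only)" if score["directional_only"] else ""
    print(f"{title}: {score['ratio']:.1%} unknown{note}")
```

The ranking is only as good as the excerpts: a passage taken from the main body of each book gives a fairer comparison than a preface or a blurb.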

FAQ

Is this percentage exact?

No. It is a planning heuristic. The point is to estimate whether the passage looks readable enough to preserve momentum, not to classify every word perfectly.

What is a safe unknown-word ratio for extensive reading?

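Many readers feel comfortable when the unknown share stays very low. Vocabulary research commonly cites roughly 95 to 98 percent known-word coverage, that is, about 2 to 5 percent unknown words, as the range for comfortable unassisted reading. Once the unknown share rises noticeably above that, reading starts to feel more like decoding than contact with the language.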

Can a text still feel hard even with a decent coverage estimate?

Yes. Topic familiarity, sentence length, and style still matter. Coverage is a strong signal, but not the only one.

Related Reading

If the ratio is manageable, keep the deck smaller than the text

Use BookToAnki to filter extraction and only keep the words that deserve review, not every unfamiliar item in the sample.

Build a Selective Deck