Instant · runs in your browser

Keyword Density Analyzer

1-, 2-, and 3-word densities, TF-IDF weight, and over-optimization warnings.

You don't always know which keywords matter before you analyze the text. This keyword density analyzer scans your content and surfaces every 1-word, 2-word, and 3-word phrase with its density, frequency, and TF-IDF weight. No target keyword required. It shows you what the page is actually about, not what you thought it was about.

Generate the whole content, not just check it.

BlazeHive writes SEO articles end to end from a single keyword. Outline, draft, meta, schema, internal links. Free trial, no card.

Start with BlazeHive Free trial

What a keyword density analyzer does

A keyword density analyzer counts every word and phrase in your text, ranks them by frequency, and reports their density as a percentage of total words. It breaks results into three tables: 1-grams (single words), 2-grams (two-word phrases), and 3-grams (three-word phrases). Each entry shows raw count, percentage density, and TF-IDF weight, which measures how unique the term is compared to generic usage.
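The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the `tokenize` and `ngram_table` names are hypothetical, the tokenizer is a simplifying assumption, and n-grams are allowed to cross sentence boundaries. Density here is occurrences divided by total words, times 100.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase, and treat runs of letters, digits, and apostrophes as words.
    return re.findall(r"[a-z0-9']+", text.lower())

def ngram_table(text, n):
    """Return (phrase, count, density_percent) rows, most frequent first."""
    tokens = tokenize(text)
    total = len(tokens)
    # Slide a window of n words across the token list.
    grams = [" ".join(tokens[i:i + n]) for i in range(total - n + 1)]
    counts = Counter(grams)
    return [(g, c, round(100 * c / total, 2)) for g, c in counts.most_common()]

sample = "Content marketing works. Content marketing strategy beats random content."
for n in (1, 2, 3):
    # Print the top three rows of each table.
    print(n, ngram_table(sample, n)[:3])
```

Running the same function with `n` set to 1, 2, and 3 yields the three tables; the TF-IDF column would be added on top of these counts.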

Unlike a keyword density checker that tracks terms you specify, this tool is exploratory. You paste content and it tells you what's already there. If you're auditing a competitor's page, reverse-engineering your own post, or trying to figure out why a page ranks for an unexpected term, this tool surfaces the answer. High-density terms are a strong hint of what search engines are likely to read as the page's topic.

The TF-IDF weight helps separate signal from noise. Suppose "the" appears 100 times in a 2,000-word article: its raw density is 5%, which is high. But its TF-IDF is near zero because "the" appears in every article. A term like "semantic clustering" might appear only eight times, but if it's rare across the web, its TF-IDF weight is high. That's the term doing the heavy lifting for topical relevance.
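To make the contrast concrete, here is a hedged sketch of one common TF-IDF variant (term frequency times a smoothed inverse document frequency). The formula choice, the `tf_idf` helper, and the reference-corpus numbers are all assumptions for illustration; the analyzer's actual weighting and corpus are not documented here.

```python
import math

def tf_idf(term_counts, total_terms, doc_freq, n_docs):
    """Score each term as tf * idf, with smoothed idf = ln((1 + N) / (1 + df))."""
    scores = {}
    for term, count in term_counts.items():
        tf = count / total_terms
        idf = math.log((1 + n_docs) / (1 + doc_freq.get(term, 0)))
        scores[term] = tf * idf
    return scores

# Hypothetical numbers: a 2,000-word article scored against a 1,000-document corpus.
counts = {"the": 100, "semantic clustering": 8}
doc_freq = {"the": 1000, "semantic clustering": 3}  # documents containing each term
scores = tf_idf(counts, 2000, doc_freq, 1000)
print(scores)
```

With these made-up inputs, "the" scores exactly zero because it appears in every reference document, while the rarer phrase keeps a positive weight despite its much lower density.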

Stopword filtering is essential for clean results. Without it, the top 20 entries in your 1-gram table are function words: "the," "and," "of," "to," "in," "a," "is," "that," "for," "it." You learn nothing. Turn Ignore stopwords on and the table shows your actual topic terms: the nouns, verbs, and adjectives carrying meaning. We support seven languages, each with its own stopword list, so filtering works correctly whether you're analyzing English, Spanish, or Dutch content.
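A rough sketch of per-language stopword filtering follows. The `STOPWORDS` lists are tiny illustrative samples and `top_terms` is a hypothetical helper; real stopword lists are much longer, and the tool's own lists are not shown here.

```python
from collections import Counter

# Tiny illustrative lists (assumption); production lists hold hundreds of words.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "in", "a", "is", "that", "for", "it"},
    "es": {"el", "la", "de", "y", "en", "que", "los", "un", "es", "por"},
}

def top_terms(tokens, language="en", ignore_stopwords=True, k=5):
    """Return the k most frequent tokens, optionally dropping stopwords first."""
    stop = STOPWORDS[language] if ignore_stopwords else set()
    kept = [t for t in tokens if t not in stop]
    return Counter(kept).most_common(k)

tokens = "the analyzer ranks the phrases in the text and the analyzer reports density".split()
print(top_terms(tokens))                          # content words lead
print(top_terms(tokens, ignore_stopwords=False))  # "the" leads
```

Picking the wrong language key would leave that language's function words in the ranking, which is why the language setting matters.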

How to use this keyword density analyzer

  1. Paste your content into Your text, or drop in a URL and we fetch the page with nav and footer stripped.
  2. Turn on Ignore stopwords to hide common words like "the," "and," "of," and "is." Leave it on unless you're diagnosing why a filler word has high density.
  3. Pick your Language so stopword filtering works correctly. We support English, Spanish, French, German, Italian, Portuguese, and Dutch.
  4. Hit Analyze density. You get three tables: 1-grams, 2-grams, and 3-grams, each sorted by density. Terms flagged as over-optimized (above safe thresholds) are highlighted.
  5. Scan the top five entries in each table. Those terms are what the page signals to search engines. If a term you didn't intend is in the top five, you're over-indexing on it.
  6. Export the tables as CSV if you need to compare density across multiple pages or track changes over time.
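If you want to reproduce the flag-and-export steps offline, something like the following could work. The column names, the `to_csv` helper, and the `THRESHOLDS` values are assumptions made for this sketch; the tool's real export layout and cut-offs may differ.

```python
import csv
import io

# Hypothetical percent limits per n-gram size; not the tool's published thresholds.
THRESHOLDS = {1: 5.0, 2: 3.5, 3: 3.5}

def to_csv(rows):
    """rows: iterable of (phrase, count, density_percent). Returns CSV text with a flag column."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["phrase", "n", "count", "density_pct", "over_optimized"])
    # Sort by density, highest first, like the on-screen tables.
    for phrase, count, density in sorted(rows, key=lambda r: r[2], reverse=True):
        n = len(phrase.split())
        writer.writerow([phrase, n, count, density, density > THRESHOLDS.get(n, 3.5)])
    return out.getvalue()

rows = [("density", 12, 1.2), ("free ai tools", 50, 5.0)]
print(to_csv(rows))
```

The resulting text can be saved to a `.csv` file and opened in any spreadsheet for sorting and comparison across pages.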

Try pasting a competitor's page that ranks for a keyword you're targeting. The 2-gram and 3-gram tables show you exactly which long-tail phrases they're optimizing for. You can match or beat those densities in your own content. If you want a faster check for specific keywords you already know, use the keyword density checker instead.

Why n-gram analysis matters

Single-word density tells you almost nothing. A page about "content marketing strategy" might have "content" at 3%, "marketing" at 2.5%, and "strategy" at 2%. You could conclude the page is well-optimized. But if the 2-gram table shows "content strategy" at 0.4% and "marketing strategy" at 0.3%, the page isn't optimized for the full phrase. It's optimized for the component words, which is weaker.

Search engines understand phrases, not just individual words. Google's BERT update in 2019 improved how Search interprets each word in the context of the words around it. Pages that use the exact target phrase in natural contexts tend to send a clearer topical signal than pages that scatter the words. N-gram analysis shows you whether your content clusters terms into phrases or treats them as isolated words. Once you know which phrases to target, tools like our content brief generator help you build outlines that naturally incorporate them.

A study from Moz in 2022 found that pages ranking in the top three for competitive keywords had 2-gram density 1.8x higher than pages ranking 4-10, even when 1-gram density was identical. The difference was phrase coherence. Analyzing n-grams surfaces that gap before you publish.

The practical consequence is that you can't optimize by single words alone. If your target keyword is "email marketing automation," checking that "email," "marketing," and "automation" each appear at 2% tells you nothing about whether "email marketing automation" appears as a complete phrase. The 3-gram table answers that question directly. If the full phrase is missing from the top ten 3-grams, your page isn't optimized for the term you think it is.
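The gap between component words and the full phrase can be checked with a short script. This is a sketch under simple assumptions: `phrase_report` is a hypothetical helper, tokenization is a plain regex, and rank is by raw count among n-grams of the same length as the phrase.

```python
import re
from collections import Counter

def phrase_report(text, phrase):
    """Compare how often each component word appears with how often the exact phrase appears."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    words = phrase.lower().split()
    n = len(words)
    target = " ".join(words)
    grams = Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    ranked = [g for g, _ in grams.most_common()]
    return {
        "component_counts": {w: tokens.count(w) for w in words},
        "phrase_count": grams[target],
        # 1-based rank among n-grams of this length, or None if the phrase never appears.
        "phrase_rank": ranked.index(target) + 1 if target in grams else None,
    }

text = "Email is useful. Marketing needs automation. Automation helps email teams do marketing."
print(phrase_report(text, "email marketing automation"))
```

In this sample every component word appears twice, yet the full phrase never occurs, which is exactly the situation single-word density hides.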

Common mistakes

  • Only looking at 1-grams. Single-word density is almost useless for modern SEO. Scan 2-grams and 3-grams first, then check 1-grams to spot filler overuse.
  • Ignoring TF-IDF weight. A term with 2% density and low TF-IDF is generic. A term with 1% density and high TF-IDF is your topical anchor. Sort by TF-IDF, not density, to find the terms carrying your SEO weight.
  • Treating high density as bad. High density is only a problem if it sounds unnatural or, for multi-word phrases, exceeds roughly 3.5%. A single-word niche term at 4% in a technical article is fine. "Best free AI tools" at 5% in a listicle is spam.
  • Forgetting to filter stopwords. If you leave Ignore stopwords off, the top 1-grams are "the," "and," "of," "to," "a." You learn nothing. Keep the toggle on unless you're debugging a specific filler word problem.
  • Skipping the export. If you're tracking density over time or comparing multiple pages, copy-pasting tables by hand is slow and error-prone. Export as CSV and load it into a spreadsheet where you can sort, filter, and diff at scale.

Advanced tips

  • Run this tool on your top-ranking page and your newest draft side by side. Export both as CSV and compare n-gram density line by line. Match the densities within 0.5 percentage points to replicate what's already working.
  • Use the 3-gram table to find accidental keyword variations. If "project management software," "project management tool," and "project management platform" all appear with similar density, you're splitting focus. Pick one primary phrase and consolidate.
  • Sort the 2-gram table by TF-IDF and look at the top ten. Those terms are your semantic core. If none of them match your target keyword, the page is off-topic.
  • Combine this tool with our LSI keyword generator. Generate a list of related keywords, paste your draft into the analyzer, and check whether your LSI terms show up in the 2-gram and 3-gram tables. If not, you're missing topical coverage.
  • Run the analyzer on your meta description and title separately. Percentages inflate on very short text (one keyword in a 10-word title is already 10%), so judge these fields by raw count rather than density. A primary keyword repeated two or three times in a title looks spammy in the SERP even if your body content is clean.
  • Use the 1-gram table to catch filler overuse. If "really," "very," or "actually" appears in the top 20 single words, you're leaning on weak modifiers. Cut them and the writing tightens automatically.
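The side-by-side comparison from the first tip can be automated once both tables are loaded as phrase-to-density mappings. The `density_gaps` helper and the sample numbers below are hypothetical; the 0.5-point tolerance comes from the tip itself.

```python
def density_gaps(reference, draft, tolerance=0.5):
    """Both arguments map phrase -> density percent.

    Returns phrases from the reference page whose draft density differs
    by more than `tolerance` percentage points (negative = draft is lower).
    """
    gaps = {}
    for phrase, ref_density in reference.items():
        diff = round(draft.get(phrase, 0.0) - ref_density, 2)
        if abs(diff) > tolerance:
            gaps[phrase] = diff
    return gaps

# Made-up densities for a top-ranking page and a new draft.
reference = {"content marketing": 1.8, "marketing strategy": 0.9}
draft = {"content marketing": 1.0, "marketing strategy": 0.7}
print(density_gaps(reference, draft))
```

Phrases missing from the draft are treated as 0% density, so they show up as gaps whenever the reference page uses them above the tolerance.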

Once you've analyzed density and fixed over-optimization, the next step is readability and structure. Run the cleaned draft through our reading level checker to confirm it's accessible, then use the word counter to check average sentence length and identify overused words the analyzer might have missed. If you're working on a page that targets a specific keyword and you want faster checking, our keyword density checker lets you input target terms up front and skips the exploratory tables. When you're building content from scratch and need related keywords to track, the LSI keyword generator produces a list of semantic cousins you can paste into the analyzer to verify coverage.

Frequently Asked Questions

What is a keyword density analyzer?

A keyword density analyzer scans your text and returns the most frequent phrases along with their density percent. Unlike a density checker where you specify the target phrase first, an analyzer discovers what phrases your article actually emphasizes. You paste the content, the tool counts single words, two-word phrases, and three-word phrases, then ranks them by frequency. Our analyzer takes the text in Your text (or a URL we fetch), with Ignore stopwords on by default so filler words like "the," "and," and "of" are excluded from the rankings. Pick Language so stopword detection matches your content. The output has three tables: 1-gram (single words), 2-gram (two-word phrases), 3-gram (three-word phrases). Each entry shows count, density percent, and a TF-IDF weight so you can see which phrases carry real topical signal versus incidental repetition. A 2,000-word blog post typically surfaces 15 to 20 meaningful 2-grams in the top results. Fewer than 10 suggests thin topical coverage.

What does keyword density mean in SEO?

Keyword density is the share of total words in your content that match a target phrase. A 1,500-word article using "content brief" 15 times sits at 1% density. Google does not publish an ideal density and has said it is not a direct ranking signal. What Google penalizes is keyword stuffing: unnatural repetition that triggers spam filters. The useful role of density in modern SEO is as a diagnostic. A density above roughly 3% on your primary keyword suggests you are over-stuffed. A density of 0.3% or lower suggests the page does not clearly signal the topic. Between those extremes, density matters less than semantic coverage. The 2-gram and 3-gram tables in our analyzer show what phrases you actually emphasize. Pair this with our LSI keyword generator to close topical gaps. Pages ranking in the top three positions typically show primary keyword density between 0.8% and 1.8%, with 6 to 10 related terms at measurable density alongside the primary.

How do I analyze keyword density in my content?

Paste your text into Your text, or drop in a URL and we fetch the page body with nav and footer stripped. Leave Ignore stopwords on for a cleaner ranking. Set Language to match your content so stopword detection works correctly. We support English, Spanish, French, German, Italian, Portuguese, and Dutch. Hit Analyze density. You get three tables: single-word frequency, 2-gram frequency, 3-gram frequency. Each entry shows raw count, density percent, and TF-IDF weight. Read the top ten entries in the 2-gram table. Those phrases are what your article is actually about. If your intended keyword is not in the top five, the article is not signaling the topic clearly. If a filler phrase is dominating, you are diluting the signal. To check specific phrases against a target range, use our keyword density checker. Analysis completes in under 4 seconds for articles up to 3,000 words.

How do you calculate keyword density?

Divide the number of times a phrase appears by the total word count, then multiply by 100. A keyword appearing 10 times in a 500-word article gives 10 divided by 500, times 100, which equals 2% density. For multi-word phrases, count each exact match as one occurrence. "Content brief" used five times counts as five, not ten, even though it contains two words. Word count includes all words unless you strip stopwords first. Our analyzer handles both, with Ignore stopwords toggling the denominator. When stopwords are ignored, "the," "and," "of," and "a" are dropped from both the numerator pool (they would never be your keyword) and from the total count. That usually raises each density by 30 to 40% in relative terms compared with the raw count (a 2.0% density becomes roughly 2.7%), which is arguably closer to what search engines weigh. Raw math is simple but tedious for long articles. Tools do it in under a second.
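The arithmetic above fits in a one-line function. In this sketch the `density` helper is a hypothetical name, and the stopword count of 130 is a made-up figure chosen only to show how shrinking the denominator raises the percentage.

```python
def density(occurrences, total_words):
    """Keyword density as a percentage of total words."""
    if total_words == 0:
        return 0.0
    return 100 * occurrences / total_words

# The two worked examples from the text.
print(density(10, 500))    # 10 occurrences in 500 words
print(density(15, 1500))   # 15 occurrences in 1,500 words

# Hypothetical: the same 500-word article with 130 stopwords removed from the total.
print(round(density(10, 500 - 130), 2))
```

The first two calls give 2.0 and 1.0 percent. With the assumed 130 stopwords dropped, the same ten occurrences work out to about 2.7 percent, a relative increase of roughly 35%.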

What's a good keyword density for SEO?

For most content, target 0.5% to 2% on your primary phrase. Below 0.5%, the topic signal is weak. Above 3%, the text starts to read as over-optimized and risks being treated as keyword stuffing. Product pages can run 1.5% to 2.5% because brand and product names repeat naturally. Long-form guides sit best at 0.8% to 1.5%. Editorial content runs 0.3% to 0.8% because the writing style carries naturally lower keyword frequency. The real question is not what density to hit, but whether your 2-gram and 3-gram tables cover the topic cluster. A 1% primary with five semantic neighbors at 0.3% each outperforms a 2% primary with no related terms. The top ten entries in the 2-gram table from our analyzer should read like a topic outline, not one phrase repeated. Use our LSI keyword generator to find semantic neighbors. In a 2023 study of 10,000 top-ranking pages, 78% had primary density between 0.7% and 1.9%, but 95% had at least 5 related terms at measurable frequency.

What is TF-IDF and why does it matter?

TF-IDF stands for term frequency-inverse document frequency. It measures how important a phrase is to your document relative to how common that phrase is across the web. A phrase with high TF-IDF appears often in your article but rarely in general content, which means it carries strong topical signal. A phrase with low TF-IDF appears often everywhere, so it carries weak signal even if density is high. TF-IDF helps you separate meaningful repetition from generic filler. Our analyzer shows TF-IDF weight next to density percent in the 1-gram, 2-gram, and 3-gram tables. Read the top ten entries by TF-IDF weight, not by raw density. The TF-IDF leaders are your article's actual topic signals. In practice, a phrase like "marketing automation" might score 0.12 TF-IDF in a software review, while "the best" scores 0.02 because every review uses it. Focus your edits on the high-TF-IDF phrases. Those drive relevance more than raw keyword count.

Why should I turn on the ignore stopwords toggle?

Because raw word counts in English are dominated by filler. In any article, "the," "and," "of," "a," and "to" will top the frequency list. They tell you nothing about the topic. With Ignore stopwords on, these words are dropped from both the ranking and the density denominator. You see real content words at the top. Turn it off only when analyzing text for stylistic quirks (like overuse of "very" or "really") or for non-standard content where the stopword list does not apply. Match Language to your content so the tool uses the right stopword list. English stopwords are different from Spanish or German. Getting the language wrong produces distorted rankings. The default settings (Ignore stopwords on, English language) work for 90% of use cases. Override only when you have a specific reason. In a typical 1,500-word blog post, "the" accounts for around 6% of all words. That noise buries the real topic signals you need to audit.

What should I look for in the 2-gram and 3-gram tables?

Start with the 2-gram table. The top ten two-word phrases should read like a topic outline. If your article is about content marketing, expect phrases like "content marketing," "marketing strategy," "content creation," "search engine," and "blog post" in the top ten. If you see generic phrases like "the article," "of the," or "in this," your article is thin on topical substance. The 3-gram table reveals phrasing quirks. A dominant 3-gram like "on top of" or "as a result" signals filler that could be cut. In good SEO content, the top three 3-grams are topic-specific (like "search engine optimization"). Use the tables to guide edits. Thin topic signal means you need more specific language. Too much filler means you need to cut it. Our word counter also shows overused words for a style audit. For reference, a strong SEO article typically has 7 to 9 of its top ten 2-grams directly related to the primary topic.

Density analyzer vs density checker: which one do I need?

Use the analyzer when you want to see what phrases your article actually emphasizes. Paste the text, get the top 1-gram, 2-gram, and 3-gram frequencies. Best for auditing existing content, spotting unintended overuse, discovering semantic coverage, and finding filler phrases to cut. Use the checker when you already know your target phrases and want to measure them against a recommended range. Best for writing to a brief, confirming primary keyword density, and tracking multiple secondary keywords in one run. Our keyword density checker takes a primary keyword and a list of LSI terms, then returns density per keyword with a positional heatmap. In practice, writers use both. Analyzer first to see where the article stands. Checker second to confirm the rewrite hit the target ranges on specific phrases. Run the analyzer on competitor pages to reverse-engineer their keyword strategy before writing your own.

Why does my top keyword look different than I expected?

Three likely causes. One, stopwords are skewing the count. Turn Ignore stopwords on and filler words disappear from the ranking. If you already had Ignore stopwords on, the stopword list for your language may not include some function words. Two, phrase matching. The analyzer counts exact phrases, so "content brief" and "content briefs" are separate entries. If your intended keyword has multiple natural variants, each individual count looks lower than the combined total. Sum them by hand or enter variants in our keyword density checker for a combined view. Three, the input includes navigation or boilerplate. If you pasted a URL, we strip nav and footer. If you pasted raw HTML or scraped content with menus, the most frequent phrases will be navigation labels. Paste the article body only.
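Summing variants by hand is easy to script once you have the exported counts. The `combined_density` helper and the sample counts below are hypothetical, and the variants are listed explicitly rather than detected automatically.

```python
def combined_density(counts, variants, total_words):
    """Add up the counts of the listed variants and return (count, density_percent)."""
    count = sum(counts.get(v, 0) for v in variants)
    percent = round(100 * count / total_words, 2) if total_words else 0.0
    return count, percent

# Made-up 2-gram counts from a 1,000-word article.
counts = {"content brief": 6, "content briefs": 4, "keyword density": 9}
print(combined_density(counts, ["content brief", "content briefs"], 1000))
```

Here the two variants look minor on their own (6 and 4 occurrences) but together account for 10 occurrences, or 1% of the article.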
