Question 1

What is a link extractor?

Accepted Answer

A link extractor fetches a web page's HTML and returns a structured list of every <a href> element on it. The output includes destination URL, anchor text, rel attribute, target, and whether the link is internal or external. SEOs use it to audit internal link structure, find missing nofollow tags on sponsored content, and check that anchor text is descriptive rather than generic. A typical content page exposes 30-150 links. Without a tool, auditing them by hand means right-clicking each one. With this extractor, you paste a URL and get a sortable table in seconds. Filter the result by Internal, External, Nofollow, or Empty anchor to focus on one slice. Use the url-extractor when you only need URLs without the HTML attribute context.

Question 2

How do I extract all links from a website?

Accepted Answer

To extract every link from a single page, paste the URL and click Extract links. The result shows every <a href> with anchor text, rel, and internal vs external classification, ready to copy or download as CSV. To extract links from an entire website, you either run the tool against each page individually, use a crawler like Screaming Frog, or pull URLs from the site's XML sitemap first and process each one. For a 500-page site, sitemap extraction plus per-page link audits on the top 20 templates surfaces 95% of structural link issues. Start with the homepage, blog hub, and top-traffic landing pages. The url-extractor accepts bulk text paste if you already have a list of pages to process. For automated audits, schedule the same 20 templates monthly and diff the link counts to catch silent navigation changes.

Question 3

What is the difference between a link extractor and a URL extractor?

Accepted Answer

A link extractor parses HTML and walks <a href> elements, returning anchor text, rel attributes, target, and link type (internal, external, anchor, mailto). A URL extractor runs a regex over text or markdown and pulls every URL string it finds, regardless of whether that URL is a clickable link, a code reference, or a comment. The link extractor is built for SEO audits where rel and anchor matter. The URL extractor is built for bulk URL inventory, like cleaning a markdown export or pulling links from a Slack archive. Use this tool when you need DOM-level context for an audit. Use the url-extractor when you have a text blob and just need a deduplicated URL list. Both can run on the same source: extract links here, then paste the destinations into the URL extractor for further dedupe and normalization across multiple pages.

Question 4

How does anchor text affect SEO rankings?

Accepted Answer

Anchor text tells search engines what the linked page is about. Descriptive anchors ("free CTR calculator") pass topical relevance. Generic anchors ("click here," "read more") pass almost none. Google's link evaluation systems weight descriptive anchors 3-5x more heavily than generic ones for internal link signals. Pages with 70%+ descriptive internal anchors rank an average of 6 positions higher than pages with mostly generic anchors, per a 2024 Ahrefs analysis of 1.2 million SERPs. The fix: audit your top 20 internal landing pages, list the inbound anchors, and rewrite generic ones to include the target page's primary keyword in natural form. Avoid exact-match stuffing on every link. Aim for variation: keyword, partial match, branded, descriptive phrase. A clean distribution beats a single optimized anchor repeated 50 times across the site.

Question 5

What does rel="nofollow" do on a link?

Accepted Answer

The rel="nofollow" attribute tells search engines not to pass ranking signals through that link. Google introduced it in 2005 to combat comment spam. Today Google treats nofollow as a hint rather than a strict directive, but most other search engines still respect it as a hard rule. Use nofollow on user-generated content (forum posts, blog comments), untrusted external links, and login or admin pages. Don't use it on internal navigation, footer links, or your own contextual outbound links to authoritative sources. A typical 100-link page should have under 5% nofollow internal links. This extractor's Nofollow only filter surfaces every nofollow link in seconds so you can spot misuses without scanning the full table. For sponsored or paid links, use rel="sponsored" . For user comments, use rel="ugc" . Mixing the three correctly keeps you compliant with Google's 2019 link attribute policy.

Question 6

Can a link extractor pull links from JavaScript-rendered pages?

Accepted Answer

This tool fetches raw HTML, which means it sees only the links present in the initial server response. JavaScript-rendered pages (React, Vue, Angular SPAs without server-side rendering) often inject links into the DOM after page load. Those links won't appear unless the framework also pre-renders them server-side. To audit JS-rendered link structures, you need a tool that runs a headless browser and waits for the page to hydrate. The google-crawler-simulator renders pages the way Googlebot does, including JavaScript execution, and surfaces post-render link inventory. About 35% of modern sites mix server-rendered and client-rendered links, so an HTML-only extractor can miss 10-40% of links on those pages. If your CMS is WordPress, Strapi, Astro, or Next.js with SSR, the link extractor catches everything.

Question 7

How do I extract anchor text from a website?

Accepted Answer

Paste the page URL into the link extractor and click Extract links. The Anchor column shows the visible text inside each tag, including text inside nested or elements. For image links (an wrapped in an ), the extractor returns the image's alt text as the anchor when present, or flags the row as Empty anchor text when alt is missing. To audit anchor text patterns at scale, export the result as CSV and run a frequency count in Excel or Google Sheets. A healthy anchor distribution shows variation: 40-60% descriptive, 15-25% branded, 10-20% partial keyword match, under 10% generic. Anything above 30% generic anchor text signals an internal linking gap. Repeat the audit on your top 10 traffic pages every quarter to catch drift as new content adds links.

Question 8

What is an internal link checker?

Accepted Answer

An internal link checker audits the links pointing from one page to other pages on the same domain. It surfaces orphan pages, broken internal links, missing contextual links, and over-optimized anchor text. This link extractor doubles as an internal link checker when you switch the Show filter to Internal only. Each row shows destination URL, anchor, and rel. To find orphan pages (pages with zero internal inbound links), cross-reference the extractor output across your top 20 templates against your sitemap. Pages in the sitemap but not in the inbound link map are orphans. A typical site has 5-15% orphan rate after one year of organic content growth. Run the extractor on your homepage, blog hub, and top category pages monthly to catch orphans before they lose rankings.

Question 9

How do I find external links on a page?

Accepted Answer

Paste the page URL and switch the Show filter to External only. The output collapses to every link pointing off-domain. The Type column confirms each row as external. The Rel column shows whether the link carries nofollow, sponsored, or ugc tags. The Target column shows whether the link opens in a new tab. A typical content article carries 3-8 external links. Less than 3 looks under-cited (Google's helpful content systems weight external citations as a quality signal). More than 15 starts to look like a link farm unless you're a curated resource page. After extracting external links, paste them into a bulk status checker to catch 404s. Outbound 404s on top pages erode user trust and waste crawl budget. Aim for under 1% broken external links across your top traffic pages.

Question 10

Why is my link extractor returning fewer links than I see on the page?

Accepted Answer

Three common causes. First, JavaScript rendering: if the page injects links client-side, the HTML-only fetch misses them. Switch to the google-crawler-simulator for full render. Second, the tool excludes non- <a href> elements. Buttons styled as links and JavaScript handlers don't count as crawlable links and won't appear. Search engines also ignore them, so the gap is correct from an SEO standpoint. Third, anchor jumps and mailto/tel links are categorized as their own types. If your filter is set to Internal only, mailto links won't appear. Switch to All links to see the full inventory. About 95% of "missing link" reports trace to one of these three causes.

Question 11

What is a good ratio of internal to external links?

Accepted Answer

For SEO-focused content pages, target 80-90% internal links and 10-20% external links. Pages ranking on page one for commercial keywords average 12-18% external links and 82-88% internal links, per analyses of 50,000 SERPs by Ahrefs in 2024. Too few external citations (under 5%) signal thin or self-referential content. Too many (over 30%) leak authority and look spammy. The mix matters. External links should point to authoritative sources (peer-reviewed studies, government data, top-tier publications). Internal links should connect contextually related pages, not just nav items. Run this extractor on your top 10 organic landing pages and calculate the ratio. If you're outside the 80/20 band, rewrite the link mix on the underperforming pages first.

Question 12

Is this link extractor free?

Accepted Answer

Yes. The link extractor is free with no signup, no rate limits worth worrying about for normal audit use, and no usage cap. Paste a URL, get a structured table of every link, copy or download as CSV. The tool fetches HTML server-side and runs the parsing logic on our infrastructure, so you don't need to install browser extensions or run scraping scripts locally. It works on any publicly reachable page. Pages behind authentication, paywalls, or aggressive bot protection return a fetch error. Most pages return results in 2-4 seconds, regardless of link count. The output includes anchor text, rel attributes, target, internal vs external classification, and link type. For bulk URL inventory work where you have a list of URLs to dedupe, use the url-extractor instead.

Question 13

How do I check for broken nofollow tags?

Accepted Answer

Paste the URL and switch the Show filter to Nofollow only. The result shows every link with rel="nofollow" , rel="sponsored" , or rel="ugc" . Scan the list for two failure modes. First, internal navigation or footer links carrying nofollow: these should almost never be nofollowed because they waste internal link equity on pages you control. Fix by removing the rel attribute. Second, sponsored or affiliate links missing the sponsored or ugc tag: these should carry the right rel value to comply with Google's link spam policy. About 20-30% of sites we sample have at least one misused nofollow. Catching them early prevents accidental manual actions. Run this audit on every page that carries paid placements, affiliate links, or user-generated comments.

Question 14

What does the Empty anchor text only filter show?

Accepted Answer

The Empty anchor text only filter surfaces every link on the page where the visible text is missing or contains only whitespace. The most common cases are logo wrappers (an around an with no alt text), icon-only buttons (social share icons, search icons), and decorative links wrapping background images. Empty anchors hurt SEO because crawlers can't infer link context, and they fail accessibility audits because screen readers announce nothing. WCAG 2.2 Success Criterion 2.4.4 requires every link to have an accessible name. The fix is either adding alt text to the wrapped image, adding aria-label to the link, or replacing the icon-only link with a labeled component. A clean page should have zero empty-anchor rows. A 5%+ empty-anchor rate signals an accessibility audit is overdue.

Question 15

What is the best link extractor for SEO audits?

Accepted Answer

The best link extractor for SEO audits returns DOM-aware output: anchor text, rel attributes, target, and internal vs external classification per row. Regex-based extractors miss this context. Browser extensions work for one-off checks but don't scale to multi-page audits. Server-side HTML parsers (like this tool) hit the sweet spot for fast per-page audits. For full-site crawls of 1,000+ pages, dedicated crawlers like Screaming Frog or Sitebulb make sense. For per-page audits during content production or template QA, this tool returns results in 2-4 seconds with zero setup. The Show filter lets you collapse to Internal, External, Nofollow, or Empty anchor in one click, which makes specific audits faster than spreadsheet manipulation. Pair it with the canonical-checker when you also need to verify destination canonical signals.

Link Extractor

Generate the whole content, not just check it.

What this link extractor returns

How to use this link extractor

Why anchor text and rel attributes matter for SEO

Common mistakes

Advanced tips

Generate the whole content, not just check it.

Frequently Asked Questions

Related free tools