DeepCrawl is a web-based crawling tool for anyone who wants to crawl a website. With DeepCrawl Automator, you can reduce the risk of losing organic traffic, rankings, and in turn overall revenue.
➞ Schedule automatic crawls.
➞ Breaks down issues and site structure.
➞ Insights on SEO benchmarks that other tools don’t provide.
➞ Expensive enterprise plan.
➞ Difficult to set up sub-domains.
I know getting technical SEO and performance right for your website is a difficult task, especially when you don’t have the right tools to monitor it.
What if I told you there’s a marketing tool that lets you crawl and analyze your website to surface key insights for improving its performance?
You must be thinking that if such a tool existed, you would have heard of it.
But sometimes you don’t come across the right tool, even one that can help you in many ways. Such is the case with DeepCrawl.
If you have used DeepCrawl, you know how powerful it is; if you haven’t, this guide will help you get to know it.
In this article, we will understand when and how to use DeepCrawl for your site in various situations.
DeepCrawl is an enterprise, cloud-based web crawling tool designed by experienced SEOs. It can run crawls of any size, including very large ones that aren’t feasible with software-based crawlers running on local machines.
With DeepCrawl, you can monitor and fix the problems that can appear after a website is launched or migrated. This is very important because it lets you fix an issue before Google penalizes you for it.
Apart from identifying problems, DeepCrawl can also be used for:
- Technical Auditing
- Link Auditing
- Competitor Intelligence
- Landing Page Analysis
- Site Migrations
- Website Optimization
- Website Testing
- Competitor Analysis
In the next section, we will see how to use DeepCrawl for your site.
DeepCrawl can be used for various purposes. Here, I will uncover some of the most important areas for which SEO experts use DeepCrawl.
Running frequent crawls through DeepCrawl gives you a complete overview of your website’s technical health. You can crawl any website in four quick steps, as follows.
Firstly, when you sign up for DeepCrawl, you’ll be redirected to the project dashboard (if you’re using it for the first time, then you’ll be asked to create a project).
Enter the domain you want to crawl, then assign a name to your project. After that, click on “Save and Continue”.
Secondly, select the URL sources you want to include in the crawl. You can add or remove sources later on.
- Website: Crawl subdomains as well as HTTP/HTTPS.
- Sitemaps: Upload an XML sitemap; the links found on those pages will not be crawled.
- Backlinks: Upload backlink target URLs, with backlink counts, as a CSV file. Any new URLs not found in other sources will be crawled.
- Google Search Console: Crawl URLs that are found in Search Console Analytics.
- Analytics: Crawl a set of URLs found in your web analytics (such as Google Analytics).
Thirdly, set the “Crawl Speed & Depth”. I suggest selecting a low value so you don’t have to wait hours to see results. (If you are using a free trial, your URL limit is capped at 1,000.)
Lastly, you have to determine whether the crawl will be one-time or recurring, which can be hourly, daily, weekly, fortnightly, monthly, or quarterly.
Then, hit “Start Crawl”, and you will receive an email once your crawl is complete.
DeepCrawl identifies SEO problems in your website and lists out all the issues after crawling. If you think your website needs recurring crawling, then you can schedule crawls on an hourly, daily, weekly, or monthly basis and you will receive reports in your inbox.
Suppose your clients need an instant update on their website: you just go to your latest crawl and share the data with the client. Additionally, DeepCrawl shows all the changes since the last report.
If you have found an issue and want to allocate it to a specific team member, you can do so easily with the Task Manager. Here, you can describe the problem, set the priority of the task, and assign a deadline for completion.
A good site structure enables search engines to find and index your site easily. DeepCrawl provides you with various information regarding your site structure, so you can identify opportunities for improvement.
Have you ever wondered how deeply bots have to go into your site to access certain pages?
This is crucial because it tells you how many clicks visitors must perform to reach your most important pages. The more clicks required, the fewer visitors get there.
With this information, you can optimize your site structure and crawl budget so that visitors don’t have to click through many links to reach your essential pages.
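The idea of click depth can be sketched in a few lines of Python. The link graph below is a made-up example; DeepCrawl computes this for your real site during a crawl.

```python
from collections import deque

# Hypothetical internal-link graph: page -> pages it links to.
links = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/post-1"],
    "/products": ["/products/widget"],
    "/blog/post-1": ["/products/widget"],
    "/products/widget": [],
}

def click_depth(graph, start="/"):
    """Breadth-first search: fewest clicks from the homepage to each page."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in depth:          # first visit = shortest path
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

depths = click_depth(links)
# "/products/widget" sits two clicks from the homepage via "/products"
```

Any page whose depth comes out higher than three or four clicks is a candidate for better internal linking.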
DeepCrawl also helps you optimize your sitemap by finding duplicate pages and orphaned pages that aren’t linked internally.
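An orphaned page is simply a URL that appears in your sitemap but receives no internal links. As a rough sketch of that check (the URLs here are invented for the example):

```python
# URLs listed in the XML sitemap (hypothetical example).
sitemap_urls = {"/", "/about", "/pricing", "/old-landing-page"}

# URLs discovered by following internal links during a crawl.
linked_urls = {"/", "/about", "/pricing", "/blog"}

# In the sitemap but never linked internally -> orphaned.
orphaned = sitemap_urls - linked_urls

# Linked internally but missing from the sitemap -> worth adding.
missing_from_sitemap = linked_urls - sitemap_urls
```

Two set differences are all it takes conceptually; DeepCrawl surfaces the same comparison for you in its sitemap reports.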
Migrating a site is tricky, and if not done correctly, can cause big issues. Improper website migration leads to lost revenue and a drop in organic traffic.
If you use DeepCrawl before migration, you don’t have to worry: it compares your staging and live websites so you can see the differences that will appear after migration.
With the results, you can crawl both versions again to identify the issues that need to be solved.
You can also have DeepCrawl check your sitemaps for missing important pages. Finally, you can crawl the modified URLs to test the impact of removing parameters from them.
Thin (or shallow) content increases the chance of being hit with a Panda penalty. The same can happen if Google finds too many blank pages or technical issues that hinder the user experience.
DeepCrawl uses analytics data and content algorithms to detect duplicate title tags, meta descriptions, and body content.
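The duplicate-detection idea can be illustrated in a few lines of Python; the page data below is invented for the example.

```python
from collections import defaultdict

# Hypothetical crawl output: URL -> title tag.
pages = {
    "/shoes": "Buy Shoes Online",
    "/shoes?sort=price": "Buy Shoes Online",
    "/boots": "Buy Boots Online",
}

def duplicate_titles(pages):
    """Group URLs by title and keep only titles used more than once."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        by_title[title].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

dupes = duplicate_titles(pages)
# The two "/shoes" URLs share one title -> duplicate candidates
```

The same grouping applies to meta descriptions or content hashes; DeepCrawl just does it at scale across the whole crawl.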
To run a Panda audit, you can use DeepCrawl as follows.
Firstly, make sure you integrate Google Analytics with your DeepCrawl account. DeepCrawl crawls the site, XML sitemaps, and landing pages, as well as importing Google Analytics data, to identify any problems on your site.
Navigate to the Site Explorer report; under Analytics mode, you can identify low-quality pages on your site by analyzing average bounce rate, page views, and so on.
Additionally, the “Content Size” section tells you whether the content is “thin”. If content is flagged as thin, there’s a possibility you’ll face a Panda penalty.
If you’ve been using DeepCrawl for a while and think it can only be used for crawling, you should start utilizing Custom Extraction.
Custom Extraction lets you pull the most important information out of your site’s HTML, giving you more granular data to work with.
We already discussed how to analyze site health and performance, but if you want to go beyond that, you’ll have to use a line of regex (regular expression). A regex defines a search pattern that locates specific sections of your site’s code.
Various regular expression languages are available, but DeepCrawl uses Ruby.
You may find regexes difficult to write, which is why you should first understand the basics of regular expressions.
You can use custom extractions for various targets such as 404 errors, product stock status, images, schema, event tracking, Google Analytics code, Google Tag Manager, content categories, and many more.
Note: Google Analytics also supports regex, which helps you filter reports within the tool.
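To get a feel for the kind of pattern Custom Extraction expects, here is a sketch in Python (DeepCrawl itself uses Ruby regex syntax, but simple patterns like these are largely interchangeable). The HTML snippet is invented for the example.

```python
import re

# Hypothetical page source containing a Google Analytics tag
# and a product stock indicator.
html = """
<script>
  ga('create', 'UA-1234567-8', 'auto');
</script>
<span class="stock-status">In Stock</span>
"""

# Capture the GA tracking ID (UA-XXXXXXX-X).
ga_id = re.search(r"UA-\d{4,10}-\d{1,4}", html)

# Capture the product stock status from a known markup pattern.
stock = re.search(r'class="stock-status">([^<]+)<', html)

print(ga_id.group(0))   # UA-1234567-8
print(stock.group(1))   # In Stock
```

In DeepCrawl you would paste just the pattern itself into a Custom Extraction field; the tool then records the captured value for every crawled page.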
Sometimes website developers or owners release code without considering its impact on the performance of their site. This creates a risk of losing organic traffic, rankings, and in turn overall revenue.
DeepCrawl Automator allows you to test your code before pushing the changes to the live environment. If any SEO issues are recognized, it instantly sends you a notification.
Now, we will see how the Automator works.
Firstly, it crawls the new code and generates a report if any critical issues are found. The tool also integrates with CI/CD pipelines, enabling a full QA analysis of any of your code.
Then, Automator checks for SEO issues and alerts you if an issue exceeds a threshold level. You can select from more than 150 tests and run them in different testing processes across multiple QA environments.
You can receive notifications on both email and Slack. You can also integrate Jira to automatically create tickets.
You can request a demo to use DeepCrawl Automator.
It’s great to put my opinions about the tool’s features in front of you, and how it can help you resolve technical issues to improve your site’s performance, but that’s not everybody’s concern.
Some of you would like to know whether you should buy DeepCrawl or whether it’s a waste of your time.
Let’s break it down and see how it can ease your life.
If you run one of the largest enterprise sites, diagnosing and fixing technical issues manually is quite difficult. That’s why various enterprise brands use DeepCrawl to monitor and track the technical health of their domains.
If your target audience spans various regions around the world, managing international sites becomes essential. DeepCrawl helps you set up your global content correctly so you can connect with each of your regional audiences.
DeepCrawl has a range of rich data sources that provide insights into your site’s organic growth and revenue. With automatic scheduling, you can set up crawls across many domains and get notifications about the relevant data at regular intervals.
The most common issue eCommerce sites face is that search engines fail to crawl many pages that should be indexed, which wastes the site’s crawl budget.
DeepCrawl allows you to optimize URL structures so your site’s most important pages can be crawled by search engines. You can avoid wasting crawl budget by identifying which pages search engine bots are accessing.
If you’re a publisher and your content doesn’t get crawled and indexed by search engines as quickly as possible, your organic search visibility will drop, leading to less visitor engagement.
You can integrate your log files with DeepCrawl to analyze whether your pages are being crawled and indexed frequently. You can also test XML sitemaps to check if any important pages are missing.
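The underlying idea of log-file analysis can be sketched in Python. The log lines below are invented, in common Apache combined-log style.

```python
from collections import Counter

# Hypothetical access-log excerpt (Apache combined log format).
log_lines = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /article-1 HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Oct/2023:13:58:02 +0000] "GET /article-1 HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '203.0.113.7 - - [10/Oct/2023:14:01:11 +0000] "GET /article-2 HTTP/1.1" 200 814 "-" "Mozilla/5.0"',
]

def googlebot_hits(lines):
    """Count how often Googlebot requested each URL."""
    hits = Counter()
    for line in lines:
        if "Googlebot" in line:
            url = line.split('"')[1].split()[1]   # path from the request line
            hits[url] += 1
    return hits

hits = googlebot_hits(log_lines)
# "/article-2" never gets a Googlebot hit -> it may be under-crawled
```

Pages that real visitors reach but Googlebot rarely requests are exactly the ones a log-file integration is meant to surface.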
DeepCrawl is built to provide agencies with easy task management and white-labeled reports for their clients. It helps you monitor and analyze your clients’ sites to increase organic traffic and revenue.
With DeepCrawl, you can crawl multiple sites at the same time, so you can work for multiple clients simultaneously.
Additionally, when you detect an issue, you can set it up in the Task Manager. Here, you can assign the issue to a specific team member along with the priority level and deadline for work done.
DeepCrawl offers three plans, including Light, Light Plus, and Enterprise. Recently, they revised their subscription plans to support individuals and SMEs.
DeepCrawl starts at $14 per month for its Light plan, which comes with 10,000 active URLs, one project and full API access. Most SEO tools offer API access in their enterprise plan, which gives DeepCrawl an edge over other SEO tools.
The Light Plus plan costs $62 per month and comes with 40,000 active URLs, three projects, and full API access. You can also schedule crawls weekly or monthly on this plan.
You can integrate Google Analytics and Google Search Console on the Light Plus plan, while the Light plan has no partner integrations.
For medium-sized businesses to large enterprise brands, DeepCrawl introduced an Enterprise plan where you can choose a custom number of URLs and projects.
As with most marketing tools, you can save money by subscribing annually instead of monthly: you get two months free if you opt for an annual DeepCrawl plan.
You can check detailed features in a comparison chart of all the plans here. Alternatively, you can try DeepCrawl’s 14-day free trial with limited access.
DeepCrawl is a web-based tool that helps you crawl your site and check for issues within it. However, there are other tools in the marketplace that also let you crawl your site.
Netpeak Spider is a desktop tool that helps you with SEO audits, issue identification, website analysis, and scraping. Although it doesn’t provide the same scalability as its competitors, it’s still considered one of the more powerful tools for SEO audits.
Netpeak Spider crawls your website to check for broken links, broken images, duplicate content, canonical links, redirects, robots.txt issues, and more. Fixing these issues will improve your site’s visibility and organic traffic.
With Netpeak Spider, you can analyze data (generated by segmentation tools) to monitor the performance of your site using click depth, site structure, and word count.
You can generate white-label reports that are automatically saved as PDFs. To use this feature, you have to upgrade your subscription to the Pro or Custom plan.
The Standard plan of Netpeak Spider starts from $19 per month which includes some basic features such as site optimization and scraping, website crawling, data segmentation, and SEO audit.
To crawl multiple sites at the same time instead of one, you have to upgrade your subscription to Pro plan, which starts from $39 per month.
Screaming Frog SEO Spider, also known as SEO Spider, is a website crawler that lets you crawl your website to analyze and audit technical and on-site SEO. The tool is best known for presenting complex SEO data in an easy-to-understand format.
Unlike DeepCrawl, SEO Spider is not a cloud-based crawler; instead, it’s a desktop website crawler that supports Mac, Windows, and Ubuntu.
SEO Spider comes in both free and paid versions. With the free version, you can find broken links, audit hreflang attributes, visualize your site, generate XML sitemaps, identify duplicate pages, and analyze page titles and meta descriptions.
You have to buy the paid version if you want to increase the crawl limit, as you can crawl only 500 URLs with the free version.
SEO Spider is considerably cheaper than other SEO tools, costing £149 per year. The paid plan also gives you advanced features such as scheduling, custom robots.txt, custom extraction, Google Analytics integration, and many more.
DeepCrawl is arguably the best crawling tool available right now. Compared to other tools, it generates the most in-depth results for domain scanning and site structuring.
Recently, DeepCrawl has been experimenting with a Site Explorer mode on its platform and has successfully released a beta version. You can try Site Explorer mode to help recover from a penalty.
With crawling capabilities that scan your site, or your competitors’, from top to bottom, DeepCrawl is definitely one of the best investments you can make.