Site crawling

Theory

When a client requests a page from a web application, the server usually answers with code (HTML, CSS, JavaScript, ...). This code is then rendered by the web browser. Each page contains links to other pages of the web app, as well as resources the browser needs to render it properly.

Crawling is a technique that recursively follows those links to map the website's architecture. This map sometimes reveals interesting entry points (admin login pages, API endpoints, ...) that testers can focus on.
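To illustrate the idea, here is a minimal recursive crawler sketch in Python (standard library only). The target URL https://example.com and the depth limit are placeholders; real crawlers also handle robots.txt, JavaScript-generated links, rate limiting and much more.

# Minimal recursive crawler sketch: fetch a page, extract <a href> links,
# and follow the ones that stay on the same host, up to a fixed depth.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(url, depth=2, seen=None):
    seen = seen if seen is not None else set()
    if depth == 0 or url in seen:
        return seen
    seen.add(url)
    try:
        html = urlopen(url, timeout=5).read().decode(errors="ignore")
    except Exception:
        return seen
    parser = LinkParser()
    parser.feed(html)
    for href in parser.links:
        absolute = urljoin(url, href)
        # Stay on the same host to keep the crawl scoped to the target.
        if urlparse(absolute).netloc == urlparse(url).netloc:
            crawl(absolute, depth - 1, seen)
    return seen

for page in sorted(crawl("https://example.com")):
    print(page)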

Practice

Tools such as hakrawler (Go), scrapy (Python), and spidy (Python), among many others, can be used for this purpose.

echo $URL | hakrawler -d 10
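With scrapy, a minimal spider could look like the sketch below (the SiteSpider class name and the example.com target are placeholders). Save it as spider.py and run it with scrapy runspider spider.py -o urls.json to collect the discovered URLs.

import scrapy

class SiteSpider(scrapy.Spider):
    name = "site"
    allowed_domains = ["example.com"]      # keep the crawl scoped to the target
    start_urls = ["https://example.com"]   # placeholder target

    def parse(self, response):
        # Record the visited URL, then follow every link found on the page.
        yield {"url": response.url}
        for href in response.css("a::attr(href)").getall():
            yield response.follow(href, callback=self.parse)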

Burp Suite's graphical interface is a great alternative (Dashboard > New scan, select Crawl, then define the target).

Once the crawl is complete, testers should review the discovered site architecture and look for admin paths, unusual redirections, and anything else that could point to a potential vulnerability.
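As a quick triage step, the list of crawled URLs can be filtered for keywords that often indicate sensitive functionality. The sketch below assumes the crawler output was saved one URL per line in urls.txt; the keyword list is only an example.

# Flag crawled URLs whose path hints at sensitive functionality.
KEYWORDS = ("admin", "login", "api", "backup", "config", "debug")

with open("urls.txt") as f:
    for url in f:
        url = url.strip()
        if any(k in url.lower() for k in KEYWORDS):
            print(url)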
