How to prepare for a penetration test in 9 simple steps

Monika Sadlok

QA Specialist

The Internet is a useful source of information. Unfortunately, it also stores plenty of our personal data, which makes us a tasty morsel for potential e-thieves. In this article, I describe a few methods of gathering data that can be used to build a profile of a potential target and to prepare for a penetration test.

A penetration test is a form of information security assurance. It often begins with an extensive reconnaissance phase. But how do you prepare for a penetration test? There are open-source intelligence (OSINT) techniques that can unveil a lot of data stored online, especially about companies with a large online presence. Below, I’ll present a few steps you should go through to prepare for a penetration test.

Step 1: Begin with HaveIBeenPwned and WHOIS

Email addresses are among the most exposed pieces of information about you online. Whenever you register on a website or create an account in an app, you are asked to provide your email address. Websites and apps get hacked often, and as a result email addresses and other user data may be leaked. There are ways to verify whether a given email address has appeared in such a leak.

One of the most popular is HaveIBeenPwned. It tells you whether your email address has been found in a known data breach.

Source: https://haveibeenpwned.com

Who.is is another search tool. Given a website address, it looks up the domain’s WHOIS record, which can reveal IP history, the domain expiry date and even phone numbers that could be used in social engineering attacks.

Source: who.is

Both tools are rather simple so it’s definitely worth using them at the very beginning of preparation for a penetration test.
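The lookup that who.is performs can also be scripted. Below is a minimal sketch, assuming Python 3.9+ and the plain WHOIS protocol (a query line sent over TCP port 43). The default server whois.iana.org and the helper names whois_query and parse_whois are choices made for this illustration. Real registries format their replies differently, so treat the parser as a starting point only.

```python
import socket


def whois_query(domain: str, server: str = "whois.iana.org", timeout: float = 10.0) -> str:
    """Send a plain WHOIS query over TCP port 43 and return the raw reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(raw: str) -> dict[str, list[str]]:
    """Collect 'key: value' lines; comment lines starting with % or # are skipped."""
    fields: dict[str, list[str]] = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            fields.setdefault(key.strip().lower(), []).append(value)
    return fields
```

Calling parse_whois(whois_query("example.com")) would normally return a "refer" entry naming the registry server, which can then be queried the same way to get the detailed record.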

Step 2: Perform advanced searching with Google Hacking

More advanced tools are the Exploit Database and Google Hacking. The first one is a “CVE compliant archive of public exploits and corresponding vulnerable software, developed for use by penetration testers and vulnerability researchers”. The Google Hacking Database is a part of the Exploit Database: a “categorized index of Internet search engine queries designed to uncover interesting, and usually sensitive, information made publicly available on the Internet. In most cases, this information was never meant to be made public but due to any number of factors this information was linked in a web document that was crawled by a search engine which subsequently followed that link and indexed the sensitive information”. Google Hacking itself means using advanced operators in the Google search engine to locate specific strings of text within search results.

Simple example: intext:"please find attached" "login" | password ext:pdf

Queries like the ones presented below identify interesting files (log files, for example) that contain sensitive information or the full system path of the application.

The Exploit Database is one of the more advanced tools often used before conducting a penetration test

Google Hacking Database at Exploit Database

It’s a great tool that can help you check whether your website is safe and whether all sensitive data is properly hidden.
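When you check your own website, it helps to scope such queries to a single domain. The sketch below composes a query from the intext: and ext: operators used in the example above plus the site: and inurl: operators, and turns it into a search URL. The helper names build_dork and search_url are made up for this illustration, and example.com stands in for the domain under test.

```python
from urllib.parse import quote_plus


def build_dork(site: str, *, intext: str = "", inurl: str = "", ext: str = "") -> str:
    """Compose a search query restricted to one domain from a few common operators."""
    parts = [f"site:{site}"]
    if intext:
        parts.append(f'intext:"{intext}"')  # phrase that must appear in the page body
    if inurl:
        parts.append(f"inurl:{inurl}")      # fragment that must appear in the URL
    if ext:
        parts.append(f"ext:{ext}")          # file extension, e.g. pdf or log
    return " ".join(parts)


def search_url(query: str) -> str:
    """URL-encode the query so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    query = build_dork("example.com", intext="please find attached", ext="pdf")
    print(query)
    print(search_url(query))
```

Building the query as a string and opening it by hand keeps the example within normal search engine use. Automated scraping of result pages is a separate matter and is not covered here.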

See also: How to make your software GDPR-ready?

Step 3: Check out robots.txt file for hidden, interesting directories

Most frameworks, content management systems and online shops have well-defined directory structures. That’s why the admin panel is normally found under /admin or /administration. If that’s not the case, the robots.txt file will most probably contain the directory name you are looking for. It’s worth using this simple trick to obtain a directory name.

Example of the robots.txt from The Software House website

The robots.txt file is stored in the root directory of the website. It tells search engine robots which directories they should not crawl. Keep in mind that it is only a request to well-behaved robots, not an access control, and that anyone can read the file. There are two basic ways of hiding content with the robots.txt file.

If you want to disallow robots from crawling the whole website, use these directives:
User-agent: *
Disallow: /

If you want to disallow robots from crawling a particular directory (e.g. “images”), use these directives:
User-agent: *
Disallow: /images

When you prepare for a penetration test, check the robots.txt file to see if any potentially interesting directories have been hidden. If the website administrator decided to hide these folders, it can mean that some important (or classified) pieces of information are stored there.
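Reading the file by hand is enough for one site, but the check is easy to script. Here is a minimal sketch that extracts every Disallow entry from the text of a robots.txt file. The function name disallowed_paths is invented for this example, and the input is assumed to be the file content already downloaded as a string.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the unique Disallow values from robots.txt content, in file order."""
    paths: list[str] = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()      # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value and value not in paths:      # an empty Disallow hides nothing
                paths.append(value)
    return paths


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /admin/\nDisallow: /backup\n"
    for path in disallowed_paths(sample):
        print(path)
```

The sketch ignores which User-agent group an entry belongs to, because for reconnaissance every hidden path is of interest regardless of the robot it was written for.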

See also: Introduction to cryptography

Step 4: Look through the LinkedIn profile of the company

Most often, the weakest passwords in a company belong to non-technical and management employees. That’s why LinkedIn may be a good source of information. Searching through it will help you identify directors, senior managers and other non-technical staff members. Then you can verify whether their passwords are strong enough. The “About Us” page on the company website can also lead you to an easy target.

Once you have discovered a couple of email addresses, you can derive the standard format the company uses for usernames. The password reset functionality can sometimes help as well, for example when its response reveals whether a given account exists.
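Deriving the username format can be sketched in a few lines. The example below compares one known address with the owner’s name against a handful of common naming conventions and then applies the matching one to another employee. The pattern list, the function names and the sample names are assumptions made for this illustration. Real companies may use formats that are not listed here.

```python
from typing import Callable, Dict, Optional

# Common conventions for the local part of a company email address (not exhaustive).
PATTERNS: Dict[str, Callable[[str, str], str]] = {
    "first.last": lambda f, l: f"{f}.{l}",
    "f.last": lambda f, l: f"{f[0]}.{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "firstl": lambda f, l: f"{f}{l[0]}",
    "first": lambda f, l: f,
}


def infer_pattern(email: str, first: str, last: str) -> Optional[str]:
    """Return the name of the first pattern that reproduces the known address."""
    local = email.split("@", 1)[0].lower()
    f, l = first.lower(), last.lower()  # both names are assumed to be non-empty
    for name, render in PATTERNS.items():
        if render(f, l) == local:
            return name
    return None


def apply_pattern(name: str, first: str, last: str, domain: str) -> str:
    """Build the likely address of another employee using a known pattern."""
    return f"{PATTERNS[name](first.lower(), last.lower())}@{domain}"


if __name__ == "__main__":
    pattern = infer_pattern("jdoe@example.com", "John", "Doe")
    if pattern:
        print(pattern, apply_pattern(pattern, "Anna", "Kowalska", "example.com"))
```

Checking two or three known addresses against the same pattern before trusting it reduces the chance of a coincidental match.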

Step 5: Perform IP address-related checks

Using reverse IP lookups, you can identify additional targets to poke around. Bing has an excellent search feature that works with IP addresses: the ip: operator finds websites hosted on a specific IP address. Entering “ip:***.***.***.***” in the Bing search box may help you find which websites are hosted on the provided address.

Thanks to Bing you can verify which websites are hosted by the given IP address

Step 6: Enumerate subdomains

Subdomain enumeration is one of the most important steps in assessing and discovering assets that have been exposed online by the client. It may have been done either deliberately as part of their business or accidentally due to a misconfiguration.

Subdomain enumeration can be done using a variety of tools like dnsrecon, subbrute or knock.py. Alternatively, you can perform it using Google’s site: operator or through websites like dnsdumpster or virustotal.com.

An example of using dnsrecon shows how to obtain subdomain names through brute force
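The brute-force approach itself is simple: try candidate names from a word list and keep those that resolve. The sketch below illustrates the idea with the standard library only. It is not how dnsrecon is implemented, and the function names and the tiny word list are made up for this example. The resolver is passed in as a parameter so the logic can be tried without touching the network.

```python
import socket
from typing import Callable, Dict, Iterable


def resolve(hostname: str) -> str:
    """Resolve a hostname to an IPv4 address using the system resolver."""
    return socket.gethostbyname(hostname)


def brute_force_subdomains(
    domain: str,
    words: Iterable[str],
    resolver: Callable[[str], str] = resolve,
) -> Dict[str, str]:
    """Return {hostname: address} for every candidate subdomain that resolves."""
    found: Dict[str, str] = {}
    for word in words:
        host = f"{word}.{domain}"
        try:
            found[host] = resolver(host)
        except OSError:  # socket.gaierror (name not found) is a subclass of OSError
            continue
    return found


if __name__ == "__main__":
    for host, address in brute_force_subdomains("example.com", ["www", "mail", "dev"]).items():
        print(host, address)
```

If the domain uses wildcard DNS, every candidate will resolve, so it is worth testing one random name first and discarding results that point to the same address.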

Step 7: Check out HTTP status codes and response headers

Whether it’s a valid page, a non-existing page, a redirecting page or a simple directory name, whenever you investigate it, look for subtle typos, extra spaces or redundant values in the response headers. Why is this so important? HTTP headers can reveal a lot of sensitive information, such as cookie strings or the technologies behind the web application. This kind of data can be used when troubleshooting or… when planning an attack against a web server.

Source: Burp Suite
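A proxy like Burp Suite shows the headers interactively, but a first pass can also be scripted. The sketch below takes response headers that have already been collected as a dictionary, picks out the ones that commonly disclose server software, and lists the security attributes missing from a Set-Cookie value. The header list is a small, non-exhaustive selection and both function names are invented for this illustration.

```python
from typing import Dict, List

# Headers that often disclose the server software or framework (not exhaustive).
REVEALING_HEADERS = ("server", "x-powered-by", "x-aspnet-version", "x-generator", "via")


def revealing_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return the technology-disclosing headers present in a response."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in REVEALING_HEADERS if name in lowered}


def missing_cookie_flags(set_cookie: str) -> List[str]:
    """List the security attributes that a Set-Cookie header value does not set."""
    # Everything after the first ';' is an attribute such as Path, Secure or HttpOnly.
    attributes = {
        part.strip().split("=", 1)[0].lower() for part in set_cookie.split(";")[1:]
    }
    return [flag for flag in ("secure", "httponly", "samesite") if flag not in attributes]


if __name__ == "__main__":
    sample = {"Server": "nginx/1.18.0", "X-Powered-By": "PHP/7.4.3"}
    print(revealing_headers(sample))
    print(missing_cookie_flags("PHPSESSID=abc123; Path=/; HttpOnly"))
```

The headers themselves can be collected with any HTTP client, for example from the headers attribute of the response returned by urllib.request.urlopen.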

Step 8: Make use of Shodan and Censys

Both Shodan and Censys are tools that may help you find files, IP addresses, exposed services and error messages. Programmers at Shodan and Censys have painstakingly scanned the Internet. They enumerated services and categorised their findings, making them searchable through simple keywords.

Shodan can be used to check which devices are connected to the Internet, who uses them and where they are located. Censys allows users to discover devices, networks and infrastructure on the Internet and to monitor how they change over time.

Shodan shows which devices are connected to the Internet

Step 9: Browse the site’s HTML

Content like images, JS and CSS files may be hosted in S3 buckets owned by the client. A bucket is a container in Amazon’s Simple Storage Service (S3) that stores objects and makes them available through a web service.

While performing standard reconnaissance, it may be possible to identify whether the client uses cloud infrastructure to host static or dynamic content.

In such cases, finding the buckets used by the client can be really rewarding, especially if the client has misconfigured permissions on them.

Tools like DigiNinja’s Bucket Finder can be used to automate the search process by brute-forcing names of buckets. This tool requires a well-curated list of bucket names and potentially full URLs to be effective.

A private bucket (like the example below) will not disclose files and resources.

A private bucket should not disclose files and resources

A public bucket shows the names of files and resources (like one of the examples below). These files can then be downloaded using full URLs.

A public bucket, on the other hand, normally shows the names of files
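The difference between the two cases can be checked programmatically. The sketch below assumes the responses S3 typically gives to an anonymous request for the bucket URL: 200 with an XML listing for a publicly listable bucket, 403 for an existing bucket that denies access, and 404 for a name that does not exist. Other codes, such as redirects to a different region, are reported as unknown. The function names and the sample bucket content are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# XML namespace used by S3 bucket listings (ListBucketResult documents).
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def classify_bucket(status_code: int) -> str:
    """Map the HTTP status of an anonymous bucket request to a rough verdict."""
    verdicts = {
        200: "public listing",
        403: "exists, access denied",
        404: "no such bucket",
    }
    return verdicts.get(status_code, "unknown")


def listed_keys(listing_xml: str) -> List[str]:
    """Extract the object names (Key elements) from a public bucket listing."""
    root = ET.fromstring(listing_xml)
    return [element.text for element in root.iter(f"{S3_NS}Key") if element.text]


if __name__ == "__main__":
    sample = (
        '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
        "<Name>example-bucket</Name>"
        "<Contents><Key>img/logo.png</Key></Contents>"
        "</ListBucketResult>"
    )
    print(classify_bucket(200), listed_keys(sample))
```

Each key returned by listed_keys can be appended to the bucket URL to form the full URL of the file.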

Summary

Open-source intelligence (OSINT) gathering, also known as “reconnaissance”, is the first step of a penetration test. It’s an ever-growing and continuously evolving field. The techniques presented here are only the tip of the iceberg, but these nine steps are important parts of reconnaissance. Using these simple techniques may help you build a profile of a target, reveal several weaknesses and prepare for a penetration test. After performing all of the checks above, you can move on to the penetration test itself. But this is a broader subject for a separate article.

Want to know even more about the subject of testing? Check our open-source end-to-end test automation tool.
