
Top 5 OSINT Sources for Penetration Testing and Bug Bounties

Sep 11, 2021

Probably the most frequently asked question we get from SpiderFoot users is “with so many options available, what API keys should I get for my use case?” So, we asked hakluke and dccybersec to go on a mission and figure out the top 5 for the three most common SpiderFoot use cases: Penetration Tests / Bug Bounties, Threat Intelligence, and People Investigations. This is the first post in the three-part series, focusing on Penetration Tests / Bug Bounties, and we hope you find it useful!


Keep in mind that all references to the pricing of these services were valid at the time of writing but are likely to have changed, so always visit the website of the service to get the latest pricing.


Now let’s get started…



One of the key trends in information security over the last decade has been the growing recognition that monitoring an organization’s external assets is critical to its overall security posture. As a result, many organizations have sprung up with the sole purpose of providing internet-wide scan data in easily consumable formats. The practice has become more popular because attackers have realised that mapping external assets is also an effective way of attacking an organization.


As a penetration tester or bug bounty hunter, having high-quality data at your fingertips is a huge asset. It allows you to quickly identify an organization’s external attack surface and monitor changes over time.


The Requirements


The types of data that we are most concerned with when mapping out the attack surface of an organization include:


  • Root domains that are owned by the organization
  • IP addresses that are owned/used by the organization
  • Subdomains that are owned by the organization
  • Technologies and third-party services that are in use
  • Exposed ports/services
  • Response data of exposed ports/services (HTTP response data, screenshots, banners etc.)
  • Historical data (particularly DNS and WHOIS)

Benchmarking Data Sources


It was very tough to narrow the list down to just five sources. There are many different data sources out there, and each has its own use cases: some specialise in a very specific type of data (such as reverse WHOIS) while others generalise. For these reasons, a direct comparison of the data sources is very difficult, and attempts at comparison via benchmarking simply don’t make sense. As such, this blog post should not be interpreted as a comparison of the services we mention. Rather, consider it a roundup of data sources that are generally considered to be leaders in the industry, each with very comprehensive data in its own right.


SecurityTrails


SecurityTrails was founded in 2017. Considering its relative youth, SecurityTrails has some impressive offerings and a very comprehensive set of data.


Their offerings include:


API: Access all data with a fast, clean JSON API including DNS record history, historical WHOIS data, ports, subdomains, associated hosts and website technologies.


Feeds: Generate custom data feeds to monitor changes at regular intervals.


SurfaceBrowser: Browse the detailed attack surface of any organisation through a simple web-based interface.


Attack Surface Reduction: View and monitor the attack surface of your own organisation through a simple web-based interface.


Pricing


The only offering with public pricing is the API; all other offerings require a custom enquiry. Thankfully, the API is probably the most useful offering for a penetration tester or bug bounty hunter anyway, because it can be used either as a browsable data source or integrated with your own custom tooling.


  • The free offering allows 50 API requests per month. It includes the ability to query for current DNS records, historical DNS records, subdomains and IP details.
  • The “Prototyper” plan costs $50 per month and includes 1,500 API requests per month. It adds current/historical WHOIS data, reverse WHOIS, and commercial use to the features of the free plan.
  • The “Professional” plan costs $500 per month and includes 20,000 API requests per month. It includes all of the features above plus reverse DNS searching, associated domain searching, a one-hour onboarding call and the ability to return custom data in custom formats by utilising SQL.
  • The “Business” plan costs $1,500 per month. It includes the same features as the Professional plan, but allows 65,000 requests per month.

They also offer an “Enterprise” plan, which allows for custom pricing schemes and requires you to contact the sales team.
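To make the tiers easier to compare, the short sketch below works out the effective price per request for each paid plan. The figures are copied from the list above and were only valid at the time of writing; the function names are ours, purely for illustration.

```python
# Monthly price (USD) and monthly request quota for each paid plan,
# as quoted in this article at the time of writing.
PLANS = {
    "Prototyper": (50, 1_500),
    "Professional": (500, 20_000),
    "Business": (1_500, 65_000),
}


def cost_per_request(monthly_price, monthly_requests):
    """Return the effective price of a single API request in dollars."""
    return monthly_price / monthly_requests


def cheapest_plan_for(requests_needed):
    """Return the cheapest plan whose quota covers the volume, or None."""
    candidates = [
        (price, name)
        for name, (price, quota) in PLANS.items()
        if quota >= requests_needed
    ]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    for name, (price, quota) in PLANS.items():
        print(f"{name}: ${cost_per_request(price, quota):.4f} per request")
```

The per-request price falls as the quota grows, so the choice mostly comes down to how many lookups a typical engagement actually needs.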


Data quality


The data accessible through the SecurityTrails API is exceptional. It is comprehensive both in the types of data available and in the quantity of data returned. A quick query for subdomains of tesla.com returned 410 results, and the same query for cnn.com returned 1,278. To reiterate, the number of subdomains returned is not a good way to determine the overall quality of a data source, but it’s interesting to note nonetheless.


The API in general is very snappy. Total time for a basic curl request was about 1.1 to 1.6 seconds. Of course, your results will vary depending on your location and the query that you are making.
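As a rough illustration of the subdomain query described above, here is a minimal Python sketch using only the standard library. It assumes the v1 endpoint layout (`/v1/domain/<domain>/subdomains`), an `APIKEY` request header, and a response containing a `subdomains` list of labels; check the current SecurityTrails API documentation before relying on any of these. The helper names are our own.

```python
import json
import urllib.request

# Assumed base URL for the SecurityTrails v1 API.
API_BASE = "https://api.securitytrails.com/v1"


def build_subdomain_request(domain, api_key):
    """Build (but do not send) the HTTP request for a subdomain listing."""
    url = f"{API_BASE}/domain/{domain}/subdomains"
    headers = {"APIKEY": api_key, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def expand_subdomains(domain, payload):
    """Turn the label list in a response into sorted, unique hostnames."""
    labels = payload.get("subdomains", [])
    return sorted({f"{label}.{domain}" for label in labels})


def fetch_subdomains(domain, api_key, timeout=10):
    """Send the request and return full hostnames (uses one API request)."""
    request = build_subdomain_request(domain, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return expand_subdomains(domain, json.load(response))
```

Calling `fetch_subdomains("tesla.com", "<your key>")` would consume one request from your monthly quota, so it is worth caching the result locally.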


Shodan


Shodan was founded in 2009 and is designed as a search engine for internet-connected devices. It allows you to browse devices by location and by images (RDP screenshots and IP camera images), and to extract lists of hosts using various filters.


They also offer a monitoring solution where you can provide an IP range and receive alerts for changes on those IP addresses. Shodan is really an incredibly powerful data source. It is very interesting to be able to browse the internet in this way because it lends itself to discovering interesting types of devices that are exposed to the internet: IP cameras, nuclear power plants, fridges, incinerators, you name it.


In terms of asset discovery, it also has some powerful features. One excellent way to discover assets owned by a particular company is to search the contents of the SSL certificate. For example, to discover assets owned by Shopify, you could use the following search:


ssl:"Shopify Inc."
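The same search can be run programmatically. The sketch below assumes Shodan's REST search endpoint (`/shodan/host/search` with `key` and `query` parameters) and a response whose `matches` entries carry `ip_str`, `port` and `hostnames` fields; the helper names are ours, and whether a given filter is available depends on your plan.

```python
import json
import urllib.parse
import urllib.request

# Assumed REST endpoint for Shodan host searches.
SEARCH_URL = "https://api.shodan.io/shodan/host/search"


def build_search_url(query, api_key, page=1):
    """Return the URL for a host search, with the query safely encoded."""
    params = urllib.parse.urlencode({"key": api_key, "query": query, "page": page})
    return f"{SEARCH_URL}?{params}"


def summarise_matches(payload):
    """Return sorted, unique (ip, port, hostnames) tuples from a response."""
    rows = set()
    for match in payload.get("matches", []):
        rows.add((
            match.get("ip_str", ""),
            match.get("port", 0),
            tuple(match.get("hostnames", [])),
        ))
    return sorted(rows)


def search_hosts(query, api_key, timeout=15):
    """Run the search and return the summarised matches (uses query credits)."""
    with urllib.request.urlopen(build_search_url(query, api_key), timeout=timeout) as response:
        return summarise_matches(json.load(response))
```

For example, `search_hosts('ssl:"Shopify Inc."', "<your key>")` would return one tuple per exposed service found on the first page of results.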


Pricing


  • If you’re just curious, Shodan offer a $49 one-time payment for a membership that allows you 100 query credits per month, 100 scan credits per month, 16 monitored IPs, and more.
  • The “Freelancer” plan costs $59 per month. It allows 10,000 query credits, 5,120 scan credits and 5,120 monitored IPs.
  • The “Small Business” plan is $299 per month. It allows 200,000 query credits, 65,536 scan credits and 65,536 monitored IPs, and also filtering by the “vuln” tag.
  • The “Corporate” plan is $899 per month. It allows unlimited query credits, 327,680 scan credits and 327,680 monitored IPs.

They also offer a custom-priced “Enterprise” plan which requires contacting the sales team.


Data Quality


The quality of the data is excellent. The filters and different formats (lists, maps, image galleries) in which you can view data make it a very interesting platform even just to browse. I will say, though, that the way in which the data is presented seems to be based more around IP addresses than hostnames. The data isn’t really geared towards enumerating subdomains; rather, it is geared towards monitoring IP ranges or discovering hosts running particular technologies/services. That’s not necessarily a bad thing. It just depends on the nature of the targets that you’d like to monitor.


One of Shodan’s core values is “API first”, in other words, all of their UI offerings pull data from the same API that you would be using. I’d say this is probably the same for the other data sources too, but it’s nice to know for sure!


Spyse


Spyse is a search engine that combines several different data collection tools into a “one-stop-shop” solution. The database contains a huge amount of data from all over the internet for use in reconnaissance, infrastructure scanning, pivoting with potential attack vectors and more.


A major strength of the Spyse platform is the “Any Target” mode, which, when you search on a domain name, makes the data and details for that domain available in a clear and easy-to-understand format. For example, if you were to search for spiderfoot.net through the Spyse domain search, it would enumerate the subdomains and give you information on the expiry date of the domain registration, DNS records, the TLS/SSL version in use and even the Alexa rank. This is quite powerful in the recon stage of a pentest, when you need to gather information to pivot from later.


Pricing


  • The free “Guest” plan comes with a limit of 53 filters when performing an Advanced Search and a maximum of 2 filters applied at any one time.
  • The “Standard” plan is $36 per month and comes with unlimited target lookups and 150 Advanced Search filters, with a maximum of 5 concurrently applied filters.
  • The “Pro” plan is $118 per month and comes with everything the “Standard” plan offers, plus all available Advanced Search filters with a maximum of 10 concurrent filters. It also has a bulk search option and access to the “Scroll API”.


Data quality


The quality of the data retrieved is similar to Shodan’s, and Spyse is a tool that can operate alongside it, although the types of data returned from the two APIs vary significantly. When I ran searches on open port numbers with Shodan (professional subscription) and compared them with Spyse (standard subscription), I was able to produce roughly twice the number of results. While I would normally vote in favour of quality over quantity, I can’t complain about getting more data to work with and easily filter through for better results.


API functionality


The API returns JSON, and results can be exported to CSV, which is great both for manual parsing and for feeding the output into other tools. Spyse allows you to download the first 100,000 results via the API, which is helpful if you’re trying to pull data for a target with an enormous number of assets.
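If you prefer to do the JSON-to-CSV conversion yourself, a generic sketch with Python's standard library looks like this. The record fields used in the usage note (`ip`, `port`) are hypothetical placeholders, not the actual Spyse schema.

```python
import csv
import io
import json


def json_records_to_csv(json_text, columns):
    """Convert a JSON array of objects into CSV text with the given columns.

    Keys not listed in `columns` are dropped; missing keys become empty cells.
    """
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({column: record.get(column, "") for column in columns})
    return buffer.getvalue()
```

For instance, `json_records_to_csv(saved_response_text, ["ip", "port"])` produces text that can be written straight to a `.csv` file or piped into another tool.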


Censys


Censys provides another internet data source that has many use cases including discovering internet assets, identifying risk, verifying the security of cloud configurations, and ensuring compliance. Of course, the data they provide can also be utilised to gain an understanding of an organization’s attack surface for security testing or bug bounties.


Price


For companies that want to perform more than 250 queries in a monthly period, Censys has a subscription option available from $99 to $1000 per month, depending on how many queries you plan on doing. For general use under 250 queries each month, the product is free to use.


Data quality


Using Censys together with Shodan is a popular way of performing reconnaissance for maximum inventory information. Even on its own, though, Censys is quite a good tool for discovering and monitoring assets on the internet. I particularly enjoy being able to run ongoing port scans against a target system to identify any changes, and then create alerts from the results so that I am notified if and when those changes are made. This is quite useful when working internally on a security team within a company, as it helps identify potential vulnerabilities or attack surface changes as new systems are built or existing ones are modified.
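The change-detection idea is independent of any particular provider. As a minimal sketch, assuming you save each scan as a mapping of IP address to open ports (a format we chose for illustration, not something Censys returns directly), two snapshots can be compared like this:

```python
def diff_exposure(previous, current):
    """Compare two {ip: ports} snapshots and report what changed.

    Returns {ip: {"opened": [...], "closed": [...]}} for every IP whose
    set of open ports differs between the two snapshots.
    """
    changes = {}
    for ip in sorted(set(previous) | set(current)):
        before = set(previous.get(ip, ()))
        after = set(current.get(ip, ()))
        opened = after - before
        closed = before - after
        if opened or closed:
            changes[ip] = {"opened": sorted(opened), "closed": sorted(closed)}
    return changes
```

Any non-empty result is a candidate for an alert, for example a newly opened port 3389 on a host that previously exposed nothing.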


API functionality


The web services API has a Python module that makes interacting with the API easier, and it can be used with the free account. The API can be queried with various parameters (such as CIDR or TLD) to identify information that services expose to the internet. An example is below:


$ python
>>> import censys.ipv4
>>> # Replace the placeholders with your own API ID and secret
>>> api = censys.ipv4.CensysIPv4(api_id="[INSERT_KEY]", api_secret="[INSERT_KEY]")
>>> res = api.search("ip:8.8.8.8")
>>> res
{u'status': u'ok', u'results': [{u'ip': u'8.8.8.8', u'protocols': [u'53/dns']}], u'metadata': {u'count': 1, u'query': u'ip:8.8.8.8', u'backend_time': 34, u'page': 1, u'pages': 1}}
>>> for i in res.get('results'):
...     print("{} {}".format(i.get("ip"), " ".join(i.get('protocols'))))
...
8.8.8.8 53/dns
>>>


The example above gets all running services on 8.8.8.8, one of Google’s public DNS servers. Of course, the result is an open port 53.


IntelligenceX


IntelligenceX was founded in 2018 by Peter Kleissner, is based in the Czech Republic, and was developed to perform OSINT (Open Source Intelligence) operations. Their offering is a search engine and data archive containing all kinds of data pertinent to OSINT, including email breaches, domain data, URLs, IP information, bitcoin addresses, CIDR ranges and more. The company boasts a strong commitment to privacy: they do not store searches performed by users, and they only store the data that is absolutely required to provide the service.


Price


IntelligenceX has multiple pricing options:


SMB & Enterprise


| Trial | Professional | Enterprise |
| --- | --- | --- |
| Free for 7 days | €2000 per year | €10000 to €50000 per year |
| 1 user account | 1 user account | 5-50 user accounts |
| 100 selector searches per day | 200 selector searches per day | 500 selector searches per day per user to unlimited |
| 25 phone lookups per day | 25 phone lookups per day | 100 phone lookups per day per user to unlimited |
| 10 alerts included | 10 alerts included | 100 alerts per user per day to unlimited |
| Fair use API access | Fair use API access | Open API access |

Free Tiers


| Public | Free | Academia |
| --- | --- | --- |
| No login required | Sign up required | Universities & Schools |
| 10 selector searches per day | 50 selector searches per day | 100 selector searches per day |
| 5 phone lookups per day | 10 phone lookups per day | 25 phone lookups per day |
| No alerts included | No alerts included | 10 alerts included |
| No API access | Fair use API access | Fair use API access |

Data quality


The quality of the data available from IntelligenceX is absolutely top level. Integrations with tools like Maltego, subfinder, h8mail, SpiderFoot, IntelOwl and many more make IntelligenceX an absolute go-to source for OSINT investigations. The data sources are constantly updated, which leads to very accurate results, and accuracy is of the utmost importance when you write up the findings of an investigation.


API Functionality


The IntelligenceX API has a full SDK on GitHub, with documentation on how to utilise some of the integrations as well. The SDK covers Python, Go, PHP and HTML, which allows for many different types of API queries.


Other Data Sources


Below are a few data sources that did not make the main list, but they are other excellent data sources that are worth a look. Many of the sources listed below specialise in specific types of data.


  • BuiltWith – an index of domain names along with the technologies that they utilise. This is excellent for either quickly analysing what technologies are in use on a specific domain, or discovering all domains that utilise a specific technology.
  • BinaryEdge – constantly scans the internet to provide up-to-date information on port and service exposure, possible vulnerabilities, SSL certificates and more. They also offer threat intelligence by setting up honeypots around the internet and monitoring the traffic they receive, and they monitor torrents to detect whether any torrenting activity occurs within your network space.
  • ViewDNS – Provides a JSON/XML API for various different network functions including whois lookups, email lookups, port scans, spam database lookups, ASN lookups, abuse contact lookups and more. Additionally, they offer a one-time download of root domains from 54 different ccTLDs, a total of nearly 80 million domains.
  • WhoisXMLAPI – Provides various APIs for querying domain data including WHOIS, registration statuses, subdomain lookups, IP geolocation, IP netblocks, reverse MX, reverse NS and more. They also have some other interesting APIs that are not necessarily DNS/WHOIS/IP related such as a “website contacts API”, “website categorization API”, “screenshot API” and “email verification API”.

Conclusion


It’s difficult to compare each of the services directly because they all have unique things to offer. Ultimately the decision will be based on the exact data points that you require, how you wish to consume them and your budget. For monitoring IP addresses, generally browsing the internet for interesting hosts or internet-facing IoT investigations we’d recommend Shodan. For asset discovery that is more DNS based and ongoing monitoring we’d recommend SecurityTrails and Censys. For investigation, post ranking and general digital marketing analysis, Spyse seems to be a good option. IntelligenceX is more appropriate for gathering information about email addresses and people, which can be an important step for red-team engagements.