Incredibly fast crawler designed for OSINT.
Photon can extract the following data while crawling:
The extracted information is saved in an organized manner or can be exported as JSON.
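As a rough illustration of what a JSON export of crawl results can look like, here is a minimal sketch. It is not Photon's actual code: the `export_results` helper and the category names `internal` and `emails` are hypothetical and chosen only for this example.

```python
import json


def export_results(results):
    """Serialize a mapping of category -> set of strings as JSON text.

    Hypothetical helper for illustration; not part of Photon.
    """
    # Sets are not JSON serializable, so convert each one to a sorted list.
    serializable = {category: sorted(items) for category, items in results.items()}
    return json.dumps(serializable, indent=4, sort_keys=True)


# Assumed example data; the category names are made up.
results = {
    "internal": {"https://example.com/about", "https://example.com/"},
    "emails": {"admin@example.com"},
}

exported = export_results(results)
print(exported)
```

Sorting the values keeps the output stable between runs, which makes exported files easier to compare.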
Control the timeout and delay, add seeds, exclude URLs matching a regex pattern, and other cool stuff. The extensive range of options provided by Photon lets you crawl the web exactly the way you want.
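To make the idea of excluding URLs by regex concrete, here is a small sketch of how such a filter could work. It only illustrates the general technique and is not taken from Photon's source: the `filter_urls` function, the sample URLs, and the pattern are all assumptions.

```python
import re


def filter_urls(urls, exclude_pattern):
    """Return only the URLs that do not match the exclusion regex.

    Hypothetical helper for illustration; not part of Photon.
    """
    excluded = re.compile(exclude_pattern)
    # re.search matches anywhere in the URL, not only at the start.
    return [url for url in urls if not excluded.search(url)]


# Assumed example input.
candidates = [
    "https://example.com/",
    "https://example.com/blog/post-1",
    "https://example.com/logout",
    "https://example.com/static/app.js",
]

# Skip logout links and JavaScript files.
kept = filter_urls(candidates, r"/logout|\.js$")
print(kept)
```

Using `search` rather than `match` means the pattern can target any part of the URL, such as a path segment or a file extension.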
Photon's smart thread management & refined logic give you top-notch performance.
Still, crawling can be resource intensive, but Photon has some tricks up its sleeve. You can fetch URLs archived by archive.org to be used as seeds by using
Photon can be launched using a lightweight Python-Alpine (103 MB) Docker image.
$ git clone https://github.com/s0md3v/Photon.git
$ cd Photon
$ docker build -t photon .
$ docker run -it --name photon photon:latest -u google.com
To view results, you can either head over to the local Docker volume, which you can find by running docker inspect photon, or mount the target loot folder:
$ docker run -it --name photon -v "$PWD:/Photon/google.com" photon:latest -u google.com
Photon is under heavy development, and updates that fix bugs, optimize performance & add new features are rolled out regularly.
If you would like to see the features and issues that are being worked on, you can do that on the Development project board.
Updates can be installed & checked for with the --update option. Photon has seamless update capabilities, which means you can update Photon without losing any of your saved data.
You can contribute in the following ways:
Please read the guidelines before submitting a pull request or issue.
Do you want to have a conversation in private? Hit me up on Twitter, my inbox is open :)
Photon is licensed under the GPL v3.0 license.