GitHub - HAYVENO/anonymous-scraper

Project Overview

This project provides the code examples for the article on anonymous web scraping with Node.js. It covers two tiers of scraping techniques:

Tier 1: Scraping static websites with Cheerio and user-agent rotation.
Tier 2: Scraping dynamic websites using the Incogniton API alongside Puppeteer, including pagination handling.
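As a rough illustration of the user-agent rotation mentioned for Tier 1, the sketch below hands out user-agent strings in round-robin order and builds request headers from them. It is not taken from the repository's scraper files: the `USER_AGENTS` pool, `createUserAgentRotator`, and `buildHeaders` are hypothetical names, and the user-agent strings are placeholders you would replace with current ones.

```javascript
// Hypothetical pool of user-agent strings (placeholders, not from the repository).
const USER_AGENTS = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15',
  'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0',
];

// Returns a function that yields the agents in round-robin order.
function createUserAgentRotator(agents) {
  if (!Array.isArray(agents) || agents.length === 0) {
    throw new Error('At least one user agent is required');
  }
  let index = 0;
  return function next() {
    const agent = agents[index];
    index = (index + 1) % agents.length; // wrap around at the end of the pool
    return agent;
  };
}

// Builds the headers for a single request using the next agent in the rotation.
function buildHeaders(nextUserAgent) {
  return {
    'User-Agent': nextUserAgent(),
    'Accept-Language': 'en-US,en;q=0.9',
  };
}

// Demo: four requests against a pool of three agents; the fourth reuses the first.
const nextUserAgent = createUserAgentRotator(USER_AGENTS);
const demoHeaders = [1, 2, 3, 4].map(() => buildHeaders(nextUserAgent));
console.log(demoHeaders.map((headers) => headers['User-Agent']));
```

In a static scraper, headers like these would typically be passed to the HTTP request, and the returned HTML handed to Cheerio's `load` function for parsing. Round-robin is only one possible strategy; picking an agent at random per request is a common alternative.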

It also includes Incogniton fingerprint trustworthiness tests using tools like IPHey, FingerprintPro, and SannySoft, as well as a non-headless scraping mode so you can watch the automation process in action.
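One way to switch between headless and visible runs is sketched below, assuming a plain Puppeteer launch; the repository's non-headless script may configure this differently. `buildLaunchOptions` and the `HEADLESS` variable are hypothetical names used only for illustration, while `headless` and `slowMo` are standard Puppeteer launch options.

```javascript
// Hypothetical helper: derive Puppeteer launch options from an environment-like object.
// HEADLESS is an assumed variable name, not one the repository defines.
function buildLaunchOptions(env = {}) {
  // Run headless unless HEADLESS is explicitly set to the string 'false'.
  const headless = env.HEADLESS !== 'false';
  return {
    headless,
    // In visible mode, slow each Puppeteer operation by 100 ms so the steps are easy to follow.
    slowMo: headless ? 0 : 100,
  };
}

console.log(buildLaunchOptions({ HEADLESS: 'false' }));

// Usage with Puppeteer installed (not executed here):
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch(buildLaunchOptions(process.env));
```

With this sketch, running a script as `HEADLESS=false node your-script.js` would open a visible browser window, where `your-script.js` stands in for whichever file uses the helper.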

Getting Started

To get started with this project, follow these steps:

Clone the repository:

   git clone https://github.com/HAYVENO/anonymous-scraper.git

Navigate to the project directory:

   cd anonymous-scraper

Install the necessary dependencies:

Ensure you have Node.js installed. Then, run:

   npm install

Run the scraper files:

Each scraper file can be executed using Node.js. For example, to run anon-scraper2.js, use:

   node anon-scraper2.js

For files located inside a folder, such as Tests, navigate to that directory and run the desired test file using Node.js. For example, to run test-file.js, use:

   cd Tests
   node test-file.js

Or you can run the test file directly from the root folder:

   node Tests/test-file.js

Replace test-file.js or anon-scraper2.js with the specific file name you wish to execute.

Note: Before running the scripts, review the code and any associated configuration files to ensure they are set up correctly for your target websites and comply with their terms of service.

Fingerprint Trustworthiness Tests

Test	Description
IPHey	Analyzes your browser's digital identity to determine its trustworthiness.
FingerprintPro	Provides a browser fingerprinting demo that identifies users even when they are using Incognito mode.
SannySoft	Evaluates whether a browser is being controlled by automation tools like Puppeteer or Selenium.
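To give a sense of what automation checks look for, the sketch below inspects a navigator-like object for a few commonly cited signals. It is an illustration only, not the actual list of checks that SannySoft or the other tools run; `detectAutomationSignals` is a hypothetical name, and the sample objects are made up.

```javascript
// Takes a navigator-like object and returns descriptions of signals that suggest automation.
function detectAutomationSignals(nav) {
  const signals = [];
  // The WebDriver standard sets navigator.webdriver to true in automated browsers.
  if (nav.webdriver === true) {
    signals.push('navigator.webdriver is true');
  }
  // Headless Chrome has historically advertised itself in the user-agent string.
  if (/HeadlessChrome/.test(nav.userAgent || '')) {
    signals.push('user agent contains HeadlessChrome');
  }
  // Regular browsers normally report at least one preferred language.
  if (Array.isArray(nav.languages) && nav.languages.length === 0) {
    signals.push('navigator.languages is empty');
  }
  return signals;
}

// Made-up sample objects standing in for the browser's real navigator.
const automatedLike = {
  webdriver: true,
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) HeadlessChrome/120.0.0.0',
  languages: [],
};
const regularLike = {
  webdriver: false,
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0',
  languages: ['en-US', 'en'],
};

console.log(detectAutomationSignals(automatedLike));
console.log(detectAutomationSignals(regularLike));
```

In a Puppeteer script, a function like this could be run against the real `navigator` inside `page.evaluate` to spot obvious leaks before visiting the test pages.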

For more information and guidance, refer to the original article associated with this repository.

Folders and files

Tests
.eslintrc
.gitignore
.prettierrc
LICENSE
README.md
anon-scraper-non-headless.js
anon-scraper1.js
anon-scraper2.js
export-starter.js
exportData.js
index.js
package-lock.json
package.json
scraped-data.csv
scraped-data.json
