
A Complete Guide For Web Automation With Puppeteer In Node.JS

January 6, 2022/4 Comments/in Backend /by Nisarg Shrirao

Quick Summary: This complete guide covers web automation with Puppeteer in Node.js. Through step-by-step instructions and real-world examples, it takes you from installing Puppeteer to automating interactions, scraping data, and handling dynamic content, so you can put Puppeteer to work in your Node.js applications.

Introduction

Why are you here? Because you were searching for ways to automate the web, and mainly for web automation in Node.js?

Well yes… I am right!

The capacity to automate web interactions is not just a benefit but a requirement in this age of technological change. Whether you are a developer, a tester, someone seeking to simplify repetitive tasks, or a provider of Node.js development services, I will explain how to do web automation and scraping with Puppeteer.

Compared with other technologies, Node.js is a strong option for web automation, largely because Puppeteer gives it a high-level API for driving Chrome or Chromium.

Here, I will explain web automation with clear explanations and real-world examples.

Just read on!

What is Puppeteer

Puppeteer is a Node library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default but can be configured to run full (non-headless) Chrome or Chromium.

So basically, Puppeteer lets you drive a real browser from Node.js. Its API exposes the actions a user can perform in the browser, so you can carry out those operations from a script.

What can we do with Puppeteer?

  • Generating PDF from a webpage.
  • Generating screenshots from a webpage.
  • Testing Chrome extensions.
  • Web scraping.
  • Automating form submission, UI testing, keyboard input, and other tasks.
  • Access web pages & extract information using the standard DOM API.


Let’s Start

Setup

  1. Make a folder (name it whatever you like).
  2. Open the folder in your terminal or command prompt.
  3. Run npm init -y. This will generate a package.json.
  4. Then run npm install puppeteer. This will install Puppeteer, which includes Chromium.

When you install Puppeteer, it downloads a recent version of Chromium that is guaranteed to work with the API of that Puppeteer release.

Usage

Now we will learn how to use Puppeteer with some code examples!

Examples

Example Code #1 – Take a screenshot and save the image

Let’s start with the first example, where we navigate to https://www.wikipedia.org/, take a screenshot of the homepage, and save it as example.png in the same directory.

const puppeteer = require('puppeteer');

(async () => {
  // headless: false opens a visible browser window
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://www.wikipedia.org/');
  await page.screenshot({
    path: 'example.png',
    type: 'png',
    fullPage: true
  });
  await browser.close();
})();

First, we create a new browser instance using the launch() method of the puppeteer object:

const browser = await puppeteer.launch({ headless: false });

const page = await browser.newPage();

A browser can hold many pages. The Browser newPage() method creates a new page in the default browser context. A page is an object of the Page class.

Now, using the page object, we load (navigate to) the webpage that we want to take a screenshot of:

await page.goto('https://www.wikipedia.org/');

Here, we are loading the Wikipedia home page. By default, the goto method resolves when the browser’s load event fires, indicating that the page has loaded.
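Navigation can fail or time out on a slow network, so it can help to wrap goto in a small retry loop. The following is a minimal sketch, not something Puppeteer provides: gotoWithRetry and its retries parameter are hypothetical names introduced here, while waitUntil and timeout are real page.goto options.

```javascript
// Hypothetical helper (not part of Puppeteer): retry page.goto a few times.
// `page` is expected to be a Puppeteer Page object.
async function gotoWithRetry(page, url, { retries = 2, waitUntil = 'load', timeout = 30000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      // Resolves once the `waitUntil` condition is met, or rejects on error/timeout.
      return await page.goto(url, { waitUntil, timeout });
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Usage, given a Puppeteer `page`:
//   await gotoWithRetry(page, 'https://www.wikipedia.org/', { retries: 3 });
```

Rethrowing the last error keeps the failure visible to the caller instead of silently continuing with a page that never loaded.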

The screenshot method takes some options:

path: The file path where we want to save the image. Here, we save it in the current working directory.

type: The image encoding to use, either png or jpeg.

fullPage: When true, the screenshot captures the full scrollable page rather than only the visible viewport.
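To show how these options fit together, here is a small sketch that derives the screenshot options from the output file name. The helper screenshotOptionsFor is a hypothetical name, not a Puppeteer function; path, type, fullPage, and quality are real page.screenshot options.

```javascript
const path = require('node:path');

// Hypothetical helper: build options for page.screenshot() from a file path.
function screenshotOptionsFor(filePath, { fullPage = true, quality = 80 } = {}) {
  const ext = path.extname(filePath).toLowerCase();
  // Pick the encoding from the extension; anything that is not JPEG becomes PNG.
  const type = ext === '.jpg' || ext === '.jpeg' ? 'jpeg' : 'png';
  const options = { path: filePath, type, fullPage };
  // `quality` (0-100) only applies to JPEG, so it is omitted for PNG.
  if (type === 'jpeg') options.quality = quality;
  return options;
}

// Usage, given a Puppeteer `page`:
//   await page.screenshot(screenshotOptionsFor('example.jpg'));
```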

Save this code as example.js and run it with the command below to generate the screenshot:

node example.js

This writes example.png to the current working directory.

Example Code #2 – Scrape Google search and get result links

Let’s look at the second example, where we navigate to https://www.google.com, run a search, and collect the result links.

const puppeteer = require("puppeteer");

let browser;
(async () => {
  const searchQuery = "stack overflow";

  browser = await puppeteer.launch({ headless: false });
  // Reuse the tab that the browser opens at startup.
  const [page] = await browser.pages();
  await page.goto("https://www.google.com/");
  // These selectors match Google's markup at the time of writing and may change.
  await page.waitForSelector('input[aria-label="Search"]', { visible: true });
  await page.type('input[aria-label="Search"]', searchQuery);
  // Start waiting for the navigation before pressing Enter to avoid a race.
  await Promise.all([
    page.waitForNavigation(),
    page.keyboard.press("Enter"),
  ]);
  await page.waitForSelector(".LC20lb", { visible: true });
  // Runs inside the page: collect the title and parent link of each result heading.
  const searchResults = await page.evaluate(() =>
    [...document.querySelectorAll(".LC20lb")].map(e => ({
      title: e.innerText,
      link: e.parentNode.href
    }))
  );
  console.log(searchResults);
})()
  .catch(err => console.error(err))
  // browser is undefined if launch() failed, hence the optional chaining.
  .finally(() => browser?.close());

page.waitForSelector(selector[, options])

selector: A selector of the element to wait for.

options: With visible: true, the method also waits until the element is visible on the page.

page.type(selector, text[, options])

selector: A selector of the element to type into. If more than one element matches the selector, the first one is used.

text: The text to type into the focused element.

options: An object with a delay property (number), the time to wait between key presses in milliseconds. Defaults to 0.

page.evaluate(pageFunction[, ...args])

pageFunction: A function to be evaluated in the page context.

...args: Arguments to pass to pageFunction.
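The three methods described above can be combined into small reusable helpers. This is a rough sketch under stated assumptions: typeWhenVisible and countMatches are hypothetical names, and the default delay and timeout values are arbitrary. The second helper shows why the extra arguments of page.evaluate matter: the function body runs in the browser, so Node.js variables must be passed in explicitly.

```javascript
// Hypothetical helper: wait for an input to be visible, then type into it
// with a short pause between key presses.
async function typeWhenVisible(page, selector, text, { delay = 50, timeout = 10000 } = {}) {
  await page.waitForSelector(selector, { visible: true, timeout });
  await page.type(selector, text, { delay });
}

// Hypothetical helper: count the elements matching a selector.
// `selector` is handed to the page function as an argument because the
// function is executed in the page context, not in Node.js.
async function countMatches(page, selector) {
  return page.evaluate((sel) => document.querySelectorAll(sel).length, selector);
}

// Usage, given a Puppeteer `page`:
//   await typeWhenVisible(page, 'input[aria-label="Search"]', 'stack overflow');
//   const hits = await countMatches(page, '.LC20lb');
```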

Now save this code as example2.js and run it with the command below to fetch the result links:

node example2.js

Here is the result of the scraping:

[
  {
    title: 'Stack Overflow - Where Developers Learn, Share, & Build ...',
    link: 'https://stackoverflow.com/'
  },
  {
    title: '',
    link: 'https://whatis.techtarget.com/definition/stack-overflow'
  },
  {
    title: '',
    link: 'https://medium.com/swlh/the-best-and-worst-ways-to-use-stack-overflow-711a077f2892'
  },
  {
    title: '',
    link: 'https://stackoverflow.blog/2010/12/17/introducing-programmers-stackexchange-com/'
  },
  {
    title: '',
    link: 'https://stackoverflow.blog/2021/03/17/stack-overflow-for-teams-is-now-free-forever-for-up-to-50-users/'
  },
  {
    title: 'Stack Overflow Blog - Essays, opinions, and advice on the act ...',
    link: 'https://stackoverflow.blog/'
  },
  {
    title: 'Stack Overflow - Wikipedia',
    link: 'https://en.wikipedia.org/wiki/Stack_Overflow'
  },
  {
    title: 'Stack Overflow | LinkedIn',
    link: 'https://www.linkedin.com/company/stack-overflow'
  },
  {
    title: 'Logo - Stacks',
    link: 'https://stackoverflow.design/brand/logo/'
  },
  {
    title: 'Stack Overflow - Crunchbase Company Profile & Funding',
    link: 'https://www.crunchbase.com/organization/stack-overflow'
  }
]
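Several entries in this output have an empty title. A likely cause (an assumption, since the page markup is not shown here) is that innerText returns an empty string for elements that are not rendered. A small post-processing step can drop those entries and any duplicate links; cleanResults is a hypothetical helper name.

```javascript
// Hypothetical post-processing for the scraped { title, link } objects.
function cleanResults(results) {
  const seen = new Set();
  return results.filter(({ title, link }) => {
    // Drop entries without a usable title or link.
    if (!title || !title.trim() || !link) return false;
    // Drop links that were already kept.
    if (seen.has(link)) return false;
    seen.add(link);
    return true;
  });
}

// Usage with the array printed above:
//   console.log(cleanResults(searchResults));
```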

Example Code #3 – Create a PDF of the page

Let’s look at the third example, where we navigate to https://www.wikipedia.org/, generate a PDF of the page, and save it as example3.pdf in the same directory.

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    // PDF generation runs in headless mode, so set it here instead of passing --headless in args.
    headless: true,
    pipe: true,
    args: ['--disable-gpu', '--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage'],
  });

  const page = await browser.newPage();
  // networkidle2: wait until there are no more than 2 network connections for at least 500 ms.
  await page.goto('https://www.wikipedia.org/', {
    waitUntil: 'networkidle2',
  });

  await page.pdf({ path: 'example3.pdf', format: 'A4' });
  await browser.close();
})();

Here are several options you can use with the pdf() method:

printBackground: When this option is true, Puppeteer includes any background colors or images used on the web page in the PDF.

path: Specifies where to save the generated PDF file. If you omit it, nothing is written to disk and the PDF data is only returned by pdf().

format: Sets the paper format to one of the supported options: Letter, A4, A3, A2, etc.

margin: Specifies the margins of the generated PDF.
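As an illustration, the options listed above can be assembled in one place. This is only a sketch: pdfOptions is a hypothetical helper name and the 1cm default margin is an arbitrary choice, while path, format, printBackground, and margin are real page.pdf options.

```javascript
// Hypothetical helper: assemble options for page.pdf().
function pdfOptions(filePath, marginSize = '1cm') {
  return {
    path: filePath,          // where the PDF is written
    format: 'A4',            // paper format
    printBackground: true,   // keep background colors and images
    // The same margin on all four sides; values may carry units such as cm or in.
    margin: { top: marginSize, right: marginSize, bottom: marginSize, left: marginSize },
  };
}

// Usage, given a Puppeteer `page`:
//   await page.pdf(pdfOptions('example3.pdf'));
```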

Now save this code as example3.js and run it with node example3.js; you will see that a PDF is generated:

https://drive.google.com/file/d/14yToS3Fd7jKxYOQCIRASkJGJSxgbVxzr/view

Now you have some idea of how it works. For more, you can refer to https://pptr.dev/


Conclusion

Puppeteer is a Node.js library, and through its scripting capabilities it is possible to automate almost anything you can do in a browser. If you want to refresh your e-commerce operations, BigScal may be worth a look. BigScal also extends the possibilities of Puppeteer by offering a user-friendly platform that helps manage automation at large scale. In addition, BigScal provides convenient script management, from end-to-end distributed operations down to intelligent analytics, to make automation processes simpler and help you tackle complex jobs.

FAQ

Can we use Xpath in Puppeteer?

Yes, XPath can be used with Puppeteer to locate webpage elements. Puppeteer primarily locates elements with CSS selectors, but older releases also provide the page.$x() method for executing XPath queries, and newer releases accept XPath expressions through the xpath/ selector prefix. This lets you traverse the DOM and interact with elements using XPath syntax. CSS selectors are still advisable for performance and compatibility unless a particular situation demands XPath.
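Because the XPath entry point differs between Puppeteer releases, a small wrapper can hide the difference. This is a sketch under an assumption: older releases expose page.$x(), while newer releases dropped it in favor of the xpath/ prefix passed to page.$$(). Check the documentation for your installed version. queryXPath is a hypothetical helper name.

```javascript
// Hypothetical helper: run an XPath query on a Puppeteer `page`.
async function queryXPath(page, expression) {
  if (typeof page.$x === 'function') {
    // Older releases: dedicated XPath method.
    return page.$x(expression);
  }
  // Newer releases (assumed): XPath through the `xpath/` selector prefix.
  return page.$$(`xpath/${expression}`);
}

// Usage, given a Puppeteer `page`:
//   const headings = await queryXPath(page, '//h1');
```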

What is Puppeteer in automation?

Puppeteer is a Node.js library whose API gives a high level of abstraction for automating browser actions through JavaScript. It is an open-source library developed by Google. To put it simply, it enables developers to control the Chrome or Chromium browsers programmatically. Puppeteer is popular among web developers for web scraping, UI testing, generating screenshots, and any other task that requires interacting with webpages. It gives developers powerful mechanisms for building scenarios and extracting data from websites.

What is XPath in Puppeteer?

XPath in Puppeteer means using XPath expressions to find and control the elements on a webpage. Most of Puppeteer’s functionality is based on CSS selectors, but it also lets you traverse the elements of the page with XPath queries, through the page.$x() function in older releases and the xpath/ selector prefix in newer ones.

Which is better: playwright or Puppeteer?

Playwright and Puppeteer are both browser automation libraries. Playwright’s main advantage is cross-browser support: it drives Chromium, Firefox, and WebKit through a single API, and it includes features aimed at more robust automation. Puppeteer is simple and widely available. Playwright’s higher level of abstraction can make it a better fit for more complex use cases. Which one to use depends mainly on the project’s needs, the browsers you must support, and the features you expect.

Does Puppeteer work in Docker?

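Yes, Puppeteer works in a Docker container. The image needs the system libraries that Chromium depends on, and launch flags such as --no-sandbox and --disable-dev-shm-usage (the ones used in Example Code #3) are commonly required there. With that in place, Puppeteer-based tests and scripts can run successfully inside containers.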
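For reference, the container-related flags from Example Code #3 can be kept in one place. This is a minimal sketch, assuming the container runs only trusted pages, because --no-sandbox disables Chromium's sandbox. containerLaunchOptions is a hypothetical helper name, not part of Puppeteer.

```javascript
// Hypothetical helper: launch options commonly used when Chromium runs in a container.
function containerLaunchOptions() {
  return {
    headless: true,
    args: [
      '--no-sandbox',            // the sandbox often cannot start as root in a container
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage', // avoid the small default /dev/shm in Docker
    ],
  };
}

// Usage:
//   const browser = await puppeteer.launch(containerLaunchOptions());
```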

Tags: #bigscal #puppeteer #nodejs #chrome #webautomation #scrapping
