By: Waltir

Extracting data from meta tags with cheerio


As a QA engineer, you get asked to test all sorts of things on both the front end and the back end. One task that comes up time and time again is verifying that our meta tags are working correctly. Meta tags can be very important, especially for media companies that rely heavily on sharing their content across social platforms.


Getting some meta values

The following example shows how you can quickly extract some meta values from NPR's National section page.

gist:waltir/82c94c834de630f9030f95f1d8ba81cf
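
A minimal sketch of this kind of extraction, assuming the page's HTML is already available as a string. The `extractMeta` function, the `MetaRecord` type, and the selectors are illustrative assumptions, not the exact contents of the gist, and the real page may expose some of these tags under different attributes:

```typescript
import * as cheerio from "cheerio";

// One record per page; keys mirror the JSON output shown in this post.
type MetaRecord = { [key: string]: string | undefined };

// ASSUMPTION: these selectors follow common Open Graph / Twitter / Facebook
// conventions; inspect the real page to confirm which attributes it uses.
const META_SELECTORS: Array<[string, string]> = [
  ["description", 'meta[name="description"]'],
  ["og_title", 'meta[property="og:title"]'],
  ["og_url", 'meta[property="og:url"]'],
  ["og_img", 'meta[property="og:image"]'],
  ["og_type", 'meta[property="og:type"]'],
  ["twitter_site", 'meta[name="twitter:site"]'],
  ["twitter_domain", 'meta[name="twitter:domain"]'],
  ["fb_appid", 'meta[property="fb:app_id"]'],
  ["fb_pages", 'meta[property="fb:pages"]'],
];

// Parse an HTML string and collect the meta values listed above.
function extractMeta(html: string): MetaRecord {
  const $ = cheerio.load(html);
  const record: MetaRecord = {
    title: $("head > title").text().trim(),
    canonical: $('link[rel="canonical"]').attr("href"),
  };
  for (const [key, selector] of META_SELECTORS) {
    // attr() returns undefined when the tag is missing, and JSON.stringify
    // omits undefined values, so absent tags drop out of the output.
    record[key] = $(selector).first().attr("content");
  }
  return record;
}

// Example with inline HTML so the sketch runs without a network request.
const exampleHtml =
  "<html><head><title>National</title>" +
  '<meta property="og:type" content="article">' +
  "</head><body></body></html>";
console.log(JSON.stringify([extractMeta(exampleHtml)], null, 3));
```

To run this against a live page, download the HTML with whichever HTTP client you prefer and pass the response body to `extractMeta`.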


Response

After running the script above, we receive the following JSON output. While it doesn't seem like much now, let's see how we can expand upon this.

[{
   "title": "National",
   "canonical": "https://www.npr.org/sections/national/",
   "description": "NPR coverage of national news, U.S. politics, elections, business, arts, culture, health and science, and technology. Subscribe to the NPR Nation RSS feed.",
   "og_title": "National",
   "og_url": "https://www.npr.org/sections/national/",
   "og_img": "https://media.npr.org/include/images/facebook-default-wide.jpg?s=1400",
   "og_type": "article",
   "twitter_site": "@NPR",
   "twitter_domain": "npr.org",
   "fb_appid": "138837436154588",
   "fb_pages": "10643211755"
}]


Getting values from multiple posts

Obviously, we could manually check the meta values on one page quite easily. Where Cheerio shines is in verifying dozens of posts at the same time. The following script iterates over all of the posts on the page and logs their meta values to our JSON file.

gist:waltir/ddc49bfbeb82d23f197dc6b0647235d7
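
The iteration could be sketched roughly as follows. This is not the gist itself: `collectPostUrls`, `extractPostMeta`, `scrapePosts`, and the `FetchHtml` type are hypothetical names, and the default `article h2 a` selector is a placeholder assumption, not NPR's real markup. The HTTP client is passed in as a function so that any library can be used:

```typescript
import * as cheerio from "cheerio";

// Minimal per-post record; extend it with whichever meta values you verify.
interface PostMeta {
  url: string;
  title: string;
  og_title?: string;
  og_url?: string;
}

// Returns the HTML for a URL. Injected so the HTTP client is your choice.
type FetchHtml = (url: string) => Promise<string>;

// Collect absolute, de-duplicated post URLs from a listing page.
// ASSUMPTION: "article h2 a" is a hypothetical selector.
function collectPostUrls(
  listingHtml: string,
  baseUrl: string,
  linkSelector: string = "article h2 a"
): string[] {
  const $ = cheerio.load(listingHtml);
  const urls: string[] = [];
  $(linkSelector).each((_index, element) => {
    const href = $(element).attr("href");
    if (!href) return;
    // Resolve relative links against the listing page's base URL.
    const absolute = new URL(href, baseUrl).toString();
    if (urls.indexOf(absolute) === -1) urls.push(absolute);
  });
  return urls;
}

// Read a few meta values from a single post's HTML.
function extractPostMeta(url: string, html: string): PostMeta {
  const $ = cheerio.load(html);
  return {
    url,
    title: $("head > title").text().trim(),
    og_title: $('meta[property="og:title"]').attr("content"),
    og_url: $('meta[property="og:url"]').attr("content"),
  };
}

// Visit each linked post one at a time and gather its meta values.
async function scrapePosts(
  listingHtml: string,
  baseUrl: string,
  fetchHtml: FetchHtml,
  linkSelector?: string
): Promise<PostMeta[]> {
  const results: PostMeta[] = [];
  for (const url of collectPostUrls(listingHtml, baseUrl, linkSelector)) {
    results.push(extractPostMeta(url, await fetchHtml(url)));
  }
  return results;
}
```

Fetching the posts sequentially keeps the sketch simple and is gentle on the site being tested. The resulting array can be serialized with `JSON.stringify` and written to a file.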

[{
   "title": "UNC Charlotte Shooting Victim Is Honored As A Hero For Tackling Shooter",
   "url": "https://www.npr.org/2019/05/01/719222196/unc-charlotte-shooting-victim-is-honored-as-a-hero-for-tackling-shooter",
   "canonical": "https://www.npr.org/2019/05/01/719222196/unc-charlotte-shooting-victim-is-honored-as-a-hero-for-tackling-shooter",
   "description": "Riley Howell is credited with disrupting the campus shooting, dying in the incident but saving others' lives. Police say they have not determined the shooter's motive.",
   "og_title": "UNC Charlotte Shooting Victim Is Honored As A Hero For Tackling Shooter",
   "og_url": "https://www.npr.org/2019/05/01/719222196/unc-charlotte-shooting-victim-is-honored-as-a-hero-for-tackling-shooter",
   "og_img": "https://media.npr.org/assets/img/2019/05/01/ap_19121763139817_wide-c4a4fb41a7434242650ffd548f0539a110c51b9c.jpg?s=1400",
   "og_type": "article",
   "twitter_site": "@NPR",
   "twitter_domain": "npr.org",
   "twitter_img_src": "https://media.npr.org/assets/img/2019/05/01/ap_19121763139817_wide-c4a4fb41a7434242650ffd548f0539a110c51b9c.jpg?s=1400",
   "fb_appid": "138837436154588",
   "fb_pages": "10643211755"
}, {
   "title": "Alabama Lawmakers Move To Outlaw Abortion In Challenge To Roe V. Wade",
   "url": "https://www.npr.org/2019/05/01/719096129/alabama-lawmakers-move-to-outlaw-abortion-in-challenge-to-roe-v-wade",
   "canonical": "https://www.npr.org/2019/05/01/719096129/alabama-lawmakers-move-to-outlaw-abortion-in-challenge-to-roe-v-wade",
   "description": "The House overwhelmingly passed a bill Tuesday that could become the country's most restrictive abortion ban. It would make it a crime for doctors to perform abortions at any stage of a pregnancy. ",
   "og_title": "Alabama Lawmakers Move To Outlaw Abortion In Challenge To Roe V. Wade",
   "og_url": "https://www.npr.org/2019/05/01/719096129/alabama-lawmakers-move-to-outlaw-abortion-in-challenge-to-roe-v-wade",
   "og_img": "https://media.npr.org/assets/img/2019/05/01/gettyimages-465405620_wide-4c683599c9632b335771cfa7674ffaad98cb029e.jpg?s=1400",
   "og_type": "article",
   "twitter_site": "@NPR",
   "twitter_domain": "npr.org",
   "twitter_img_src": "https://media.npr.org/assets/img/2019/05/01/gettyimages-465405620_wide-4c683599c9632b335771cfa7674ffaad98cb029e.jpg?s=1400",
   "fb_appid": "138837436154588",
   "fb_pages": "10643211755"
}]

The script above outputs to a simple JSON file; however, my next step is typically to visually inspect the scraped data in a Google Sheet. Using Cheerio, we can quickly verify the accuracy of our meta values on dozens of posts in the same amount of time it would take to open and review just a handful of articles manually.
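
One way to get the scraped records into a spreadsheet is to convert them to CSV and import that file. The sketch below is an assumption about how that step might look, not part of the original scripts; `csvCell` and `toCsv` are hypothetical helpers. The header is built from the union of keys because records do not always share the same fields (the section page above has no `twitter_img_src`, while the posts do):

```typescript
// A scraped record: string values keyed by field name; missing fields are undefined.
type Row = { [key: string]: string | undefined };

// Quote a value for CSV: wrap it in double quotes and double any embedded quotes.
function csvCell(value: string | undefined): string {
  const text = value === undefined ? "" : value;
  return '"' + text.replace(/"/g, '""') + '"';
}

// Build CSV text whose header is the union of keys across all records,
// in first-seen order. Missing values become empty cells.
function toCsv(records: Row[]): string {
  const columns: string[] = [];
  for (const record of records) {
    for (const key of Object.keys(record)) {
      if (columns.indexOf(key) === -1) columns.push(key);
    }
  }
  const lines: string[] = [columns.map((column) => csvCell(column)).join(",")];
  for (const record of records) {
    lines.push(columns.map((column) => csvCell(record[column])).join(","));
  }
  return lines.join("\r\n");
}

// Example using two abbreviated records in the shape of the output above.
const exampleRows: Row[] = [
  { title: "National", og_type: "article" },
  { title: "Another post", og_type: "article", twitter_site: "@NPR" },
];
console.log(toCsv(exampleRows));
```

Quoting every cell keeps commas and quotation marks inside descriptions from breaking the column layout when the file is imported.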
