
Extracting data from meta tags with cheerio
As a QA engineer, you get asked to test lots of things on both the front end and the back end. One task that comes up time and time again is verifying that our meta tags are working correctly. Meta tags matter, especially for media companies that rely heavily on sharing their content across the various social platforms.
Getting some meta values
The following example shows how you can quickly extract some meta values from NPR.
gist:waltir/82c94c834de630f9030f95f1d8ba81cf
Response
After running the script above, we receive the following JSON output. While it doesn’t seem like much now, let’s see how we can expand upon this.
[{
"title": "National",
"canonical": "https://www.npr.org/sections/national/",
"description": "NPR coverage of national news, U.S. politics, elections, business, arts, culture, health and science, and technology. Subscribe to the NPR Nation RSS feed.",
"og_title": "National",
"og_url": "https://www.npr.org/sections/national/",
"og_img": "https://media.npr.org/include/images/facebook-default-wide.jpg?s=1400",
"og_type": "article",
"twitter_site": "@NPR",
"twitter_domain": "npr.org",
"fb_appid": "138837436154588",
"fb_pages": "10643211755"
}]
Getting values from multiple posts
Obviously, we could manually check the meta values on a single page quite easily. Where Cheerio shines is in verifying dozens of posts at the same time. The following script iterates over all of the posts on the page and logs their meta values to our JSON file.
gist:waltir/ddc49bfbeb82d23f197dc6b0647235d7
[{
"title": "UNC Charlotte Shooting Victim Is Honored As A Hero For Tackling Shooter",
"url": "https://www.npr.org/2019/05/01/719222196/unc-charlotte-shooting-victim-is-honored-as-a-hero-for-tackling-shooter",
"canonical": "https://www.npr.org/2019/05/01/719222196/unc-charlotte-shooting-victim-is-honored-as-a-hero-for-tackling-shooter",
"description": "Riley Howell is credited with disrupting the campus shooting, dying in the incident but saving others' lives. Police say they have not determined the shooter's motive.",
"og_title": "UNC Charlotte Shooting Victim Is Honored As A Hero For Tackling Shooter",
"og_url": "https://www.npr.org/2019/05/01/719222196/unc-charlotte-shooting-victim-is-honored-as-a-hero-for-tackling-shooter",
"og_img": "https://media.npr.org/assets/img/2019/05/01/ap_19121763139817_wide-c4a4fb41a7434242650ffd548f0539a110c51b9c.jpg?s=1400",
"og_type": "article",
"twitter_site": "@NPR",
"twitter_domain": "npr.org",
"twitter_img_src": "https://media.npr.org/assets/img/2019/05/01/ap_19121763139817_wide-c4a4fb41a7434242650ffd548f0539a110c51b9c.jpg?s=1400",
"fb_appid": "138837436154588",
"fb_pages": "10643211755"
}, {
"title": "Alabama Lawmakers Move To Outlaw Abortion In Challenge To Roe V. Wade",
"url": "https://www.npr.org/2019/05/01/719096129/alabama-lawmakers-move-to-outlaw-abortion-in-challenge-to-roe-v-wade",
"canonical": "https://www.npr.org/2019/05/01/719096129/alabama-lawmakers-move-to-outlaw-abortion-in-challenge-to-roe-v-wade",
"description": "The House overwhelmingly passed a bill Tuesday that could become the country's most restrictive abortion ban. It would make it a crime for doctors to perform abortions at any stage of a pregnancy. ",
"og_title": "Alabama Lawmakers Move To Outlaw Abortion In Challenge To Roe V. Wade",
"og_url": "https://www.npr.org/2019/05/01/719096129/alabama-lawmakers-move-to-outlaw-abortion-in-challenge-to-roe-v-wade",
"og_img": "https://media.npr.org/assets/img/2019/05/01/gettyimages-465405620_wide-4c683599c9632b335771cfa7674ffaad98cb029e.jpg?s=1400",
"og_type": "article",
"twitter_site": "@NPR",
"twitter_domain": "npr.org",
"twitter_img_src": "https://media.npr.org/assets/img/2019/05/01/gettyimages-465405620_wide-4c683599c9632b335771cfa7674ffaad98cb029e.jpg?s=1400",
"fb_appid": "138837436154588",
"fb_pages": "10643211755"
}]
The script above outputs to a simple JSON file. Typically, my next step is to perform a visual inspection of the scraped data in a Google Sheet. Using Cheerio, we are able to quickly verify the accuracy of our meta values on dozens of posts in the same amount of time it would take to open and review just a handful of articles manually.
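Google Sheets can import CSV files, so one way to get the scraped rows into a sheet is to convert the JSON array to CSV first. The helper below is a small sketch of that step and is not part of the original scripts; `toCsv` is a hypothetical name, and it assumes each row is a flat object like the ones shown above.

```javascript
// Convert an array of flat objects into a CSV string.
function toCsv(rows) {
  // Use the union of all keys, in first-seen order, as the column headers.
  const columns = [...new Set(rows.flatMap((row) => Object.keys(row)))];

  // Quote a cell only when it contains a comma, quote, or line break.
  const escapeCell = (value) => {
    const text = value === undefined || value === null ? "" : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };

  const lines = [
    columns.map(escapeCell).join(","),
    ...rows.map((row) => columns.map((col) => escapeCell(row[col])).join(",")),
  ];
  return lines.join("\r\n");
}

// Example usage ("meta.json" and "meta.csv" are hypothetical file names):
// const fs = require("node:fs");
// const rows = JSON.parse(fs.readFileSync("meta.json", "utf8"));
// fs.writeFileSync("meta.csv", toCsv(rows));
```

Rows that lack a tag, such as a page without `twitter_img_src`, simply get an empty cell, which makes gaps easy to spot in the sheet.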