Web Scraping 101: How to Copy Anything for Digital Signage

Scores. Stats. Summaries. You can scrape anything online—in a website or an app—then put it on your ScreenCloud signage. Here’s how.


It’s never been easier to put together top-notch digital signage with software like ScreenCloud. But if you want to go beyond the basic news, weather and traffic information, coding new dashboards or copying and pasting information from around the web onto your signage can get tedious and time-consuming.

The better way is to use data scraping to gather data at scale for a fraction of the effort. You’ll make your screens more engaging, whether they’re for customers in a waiting room or for your team’s break room. Gather the top trending keywords, live sales data, and customer support team progress to keep everyone’s finger on the pulse, without adding more hours to your work week. Here’s how, with ScreenCloud plus popular scraping tools.

What is website scraping?

Scraping web content is copying data from a web page and storing it for later use. You can manually scrape data by copying and pasting, but that’s tedious and time-consuming. The better way is to hire a robot: scraping software that copies data for you. It’s faster and more accurate—and it can automatically re-scrape data anytime anything changes, without any extra work on your part.

Typically, you’ll first open a website in the scraping tool, select what you want to extract, and then have the software export it in a CSV file or push it to an API or Webhooks URL to use elsewhere. You can then take your scraped data and use it anywhere.

Data scraping—in a way—is how Google Search was built, how ChatGPT’s AI was trained, and how many companies leverage business intelligence to gather data about competitors to improve decision-making. And it’s just as useful for grabbing the latest headlines and team performance numbers for your company’s signage. It’s workplace automation that can work for your team.

There are four main methods to scrape data from websites:

  • Using a web scraping app
  • Using Google Sheets or Excel to scrape HTML/XML data
  • Scraping data via API (if the target website has one)
  • Coding your own data scraper using Python

Here’s how to use each of those—and how to get your scraped data onto your ScreenCloud signage.

Option 1: Use a web scraping app

Let’s start with the easiest way to scrape data from a website: A web scraping app. These tools help you extract data without having to code or understand how websites work. To set up a scraping job, you’ll add a URL, select the data you want to collect, and either save it to a CSV file or push it to another app once the data is collected. Some of these tools employ AI to get around tricky websites, making them more adaptable to website changes over time.

ScreenCloud Dashboards can copy a widget or section from any site and put it on your signage.

ScreenCloud includes a basic scraping app built for digital signage: ScreenCloud Dashboards. It’s perfect to capture a portion of a webpage—a graph from Tableau or PowerBI, a table from Jira or Trello, a dashboard from Notion or Airtable—and display it on your signage. And it works much like other scraping tools. You enter a website address, select the bit of the app you’d like to copy, and ScreenCloud will show a live, updated version of it on your signage.

That works great when you want to show an entire widget, graph, or section of a site or app. If you only want a more specific bit of data—a single value, a headline sentence, or a table’s worth of data to sort and filter—you’ll need a dedicated data scraping app. Here’s a shortlist of 3 solid apps you can use for data scraping:

  • Octoparse is the robust choice, trusted by enterprises—and starts at $75/month. It can run on your local computer or in the cloud, with a no-code interface that lets you select the data you want to extract, and workflows to collect everything you need from a site. You can see the expected end result as a table at the bottom of the screen, so you don’t have to run multiple attempts to get what you want. It has a learning curve due to its flexibility and advanced scraping settings—but those advanced tools mean it can solve CAPTCHAs for you and keep your scraper from getting blocked. And it has integrations and Webhook support to send your scraped data to your signage.
  • Browse.ai is more beginner-friendly, offering ways to monitor URLs and scrape any updates with a $19/month Starter plan. It lets you configure robots that can access dynamic list pages to extract product lists or hotel pricing, for example. The data can then be sent to Google Sheets for an easy way to display it on your signage. 
  • Webscraper.io is the free option, at least if you’re scraping locally using your own browser. Install its browser extension and start scraping data right away. The data is stored in a CSV, XLSX or JSON file, ready to upload to your favorite spreadsheet or database app to then pass along to your digital signage. For cloud automation, advanced scraping features and API access, the pricing starts at $50 per month.
Load a site in Browse.ai, and its bot will let you pick any bits of text from the website to monitor and copy automatically.

Let’s set up an automation with Browse.ai to see how it works with ScreenCloud signage. Create an account and choose to monitor a site for changes (you could also have it extract data from a table, if you want). Then enter the website you want to monitor, add Browse.ai’s Chrome extension if you haven’t already, and click Start Recording.

Browse.ai will load your site with a robot icon in the top right. Sign in as normal, find the page you need, then click the robot, choose to select text, and pick the item on the site you want to monitor.

Set a schedule in Browse.ai to check your site or app for changes.

Finally, save and set up a schedule, and every time Browse.ai notices a change in the text it’s monitoring, it’ll let you know. And you can do more from the Integrations tab. There, you can have Browse.ai push the updated data to another app via API or Webhooks, save data to Google Sheets, or send it to automation platforms like Zapier.

Use ScreenCloud Playgrounds to build an auto-updating dashboard with your scraped data.

That same general workflow works with all of the scraping apps recommended above. Whenever your app scrapes something new, it can send it to Zapier, which can then push it along to your newly auto-updating ScreenCloud dashboard.
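If you’d rather catch those updates yourself instead of routing them through Zapier, a small self-hosted webhook receiver will do the job. Here’s a minimal sketch in Python using Flask; the /webhook path and the payload shape are hypothetical placeholders, so check your scraping tool’s webhook settings for the real details.

# A minimal sketch of a self-hosted webhook receiver, as an alternative to Zapier.
# The /webhook path and the JSON payload shape are hypothetical placeholders --
# check your scraping tool's webhook settings for the real details.
import json
from flask import Flask, request

app = Flask(__name__)
LATEST_FILE = "latest_scrape.json"  # a dashboard can poll this file for fresh data

@app.route("/webhook", methods=["POST"])
def receive_scrape():
    payload = request.get_json(force=True)
    # Keep only the most recent scrape so the dashboard always shows the latest value
    with open(LATEST_FILE, "w") as file:
        json.dump(payload, file)
    return {"status": "ok"}

if __name__ == "__main__":
    app.run(port=5000)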

Option 2: Use Excel or Google Sheets to scrape HTML data

Spreadsheets were the original killer app on the earliest personal computers—and it’s surprising how often spreadsheets remain the best tool for so many tasks after all these years. They’ve gained more advanced functions over the years, including tools to scrape HTML or XML data from a website using the IMPORT formula family. Add URLs to one column, the formula to another, and you’ll get your data scraped in minutes.

This is a free, simple way to scrape easy-to-access data from sites that don’t require a login. More advanced workflows will still need a full scraping tool, but for simpler jobs, spreadsheets cover the basics. And once you’ve scraped data with a spreadsheet, you can format it and then display it on ScreenCloud with the Google Sheets integration or an Excel Online embed.

Here’s how to use the core spreadsheet web scraping functions:

IMPORTHTML

A Wikipedia table, parsed into Google Sheets in seconds.

Use IMPORTHTML when you need to extract a list or table from a website’s HTML:

=IMPORTHTML(url, query, index)

  • Start by adding the URL of the target page. Be sure to enclose it in quotation marks.
  • Then, fill out the query. This can only be one of two options: use “table” for extracting data from tables and “list” for, you guessed it, extracting lists. Enclose this in quotation marks too.
  • Finally, the index refers to which table or list you want. If the page has more than one table, count down from the top of the page and write that number in the formula. If you don’t get the desired table, increase or decrease that number until you get the right one.

Here’s an example where we’re extracting a list of all documented helicopter escape attempts from prisons. Open the link in your browser to see the page, then paste the formula below into Google Sheets to see what happens:

=IMPORTHTML("https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes", "table", 2)

The table from the Wikipedia page should now be displayed inside your spreadsheet.
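If you’re after a bulleted or numbered list instead of a table, swap the query for “list”. Here’s a sketch with a guessed index (the first list on a Wikipedia page is often a navigation menu, so adjust the number until the right one appears):

=IMPORTHTML("https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes", "list", 1)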

Tips to keep in mind:

  • Be sure to include http:// or https:// at the beginning of every URL.
  • Don’t forget the quotation marks for the URL and query.
  • Separate each argument (url, query, index) using a comma.
  • In Google Sheets, make sure there are enough empty cells below and to the right of the formula cell, so the imported data has room to expand.

IMPORTXML

Copy every URL from a website in a second in Google Sheets.

The IMPORTXML formula is more versatile but slightly harder to use. Despite its name, you can use it to extract data not only from XML files but also from HTML and CSV files—which means you can copy data from almost any website. To use it, write the following:

=IMPORTXML(url, xpath_query)

Start by filling the URL argument with the target page. Then, fill in the xpath_query with the identifier of the data you want to extract. You have to input a path to the data you want, much like how you input a path in your file explorer to reach a folder on your computer. An XPath syntax reference can help you put together your query.

To discover the path to the data you want to extract on the web page:

  • Open your browser and navigate to the page you’re scraping.
  • On your keyboard, use the CTRL + SHIFT + C (CMD + SHIFT + C on a Mac) keyboard shortcut to open the developer tools and start the element inspector. Alternatively, you can right-click the content and then click on Inspect.
  • Click on the content that you want to extract.
  • Notice that the developer tools pane highlights the piece of HTML that’s displaying that element. This is what we’re looking for. Write the HTML tag that marks the content you want to extract into the xpath_query argument. Also note down any IDs and classes within that tag, as these can be used to further identify the data that you want to extract.

Here’s an example, where we’re extracting all the links from the list of micronations from Wikipedia:

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_micronations", "//@href")
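That query grabs every link on the page. For a more targeted sketch, you can filter by tag: this hypothetical example pulls just the text of the page’s section headings instead.

=IMPORTXML("https://en.wikipedia.org/wiki/List_of_micronations", "//h2")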

Tips to keep in mind:

  • Remember to add quotation marks around the URL and the XPath query.
  • Build your own query by exploring the XPath query syntax in the W3C guide.

IMPORTFEED

Google Sheets can scrape RSS and podcast feeds, too.

Remember RSS feeds? They’re great for keeping up with updates on a website, and they’re still the standard way to publish podcasts. You can import RSS feeds and their latest information using this formula in Google Sheets:

=IMPORTFEED(url, query, headers, number_of_items)

This formula only needs the URL to work. All the other arguments are optional. Here’s what they do:

  • query lets you select what information you want to grab, for example just the name or the date
  • headers can be set to TRUE to grab the header row of the feed, labeling columns appropriately
  • number_of_items lets you select how many rows you want to retrieve. Adding 3 as an argument would retrieve the first 3 rows of the feed.

Here’s the New York Times’ The Daily podcast RSS feed, grabbing the summary of each of the 5 most recent episodes:

=IMPORTFEED("https://feeds.simplecast.com/54nAGcIl", "items summary", TRUE, 5)

Tips to keep in mind:

  • Don’t forget quotes around the URL and query.
  • The headers and number arguments don’t need quotes around them.
  • If you want to set the number of items but don’t want to enter a query or get the headers, remember to fill each skipped argument with a pair of empty quotation marks. For example, =IMPORTFEED("https://feeds.simplecast.com/54nAGcIl", "", "", 1) will retrieve only the first row of the feed, with an empty query and no header.

And a pro tip: ScreenCloud supports RSS feeds natively, for another way to get live-updated data on your signage without needing to scrape it first.

Option 3: Scrape data via API

In some cases, the website you’re visiting may offer an API: an application programming interface. APIs can be used to build third-party services on top of the data and functionality of an app or website. For example, by accessing the Facebook API, you could build a social media scheduling app by pushing and pulling data to and from the platform. And, since APIs let you read data from a platform or website, you can use them to scrape data.

The upside of this method is that the end result is much more accurate than other options on the list. You’ll be accessing data sourced directly from the platform’s databases. There are some drawbacks, though:

  • It’s up to the website’s owners to decide what kinds of data they’ll offer via API, so there may be cases where you can’t access all the data you need. There’s also the possibility that some API endpoints require you to create an account or pay a subscription to access.
  • The data output is formatted in JSON (JavaScript Object Notation), a format that makes data easier to read for machines (and somewhat challenging for humans). You’ll have to convert the JSON later to an easily-readable format or to a table you can store in a CSV file, for example.
  • Each platform sets up its API in its own way, depending on the types of data or services it provides. This means each API requires a slightly different strategy as you’re calling and sending queries to it, which takes extra patience and troubleshooting skills.
  • Some APIs are very well documented, making them easy to use. Others aren’t, which means you’ll have to invest time in testing queries and exploring the data on your own.

Calling APIs using your browser

While the most powerful and flexible way to call APIs and handle output is using code or a platform like Postman, we’ll explore a simpler way using your browser. We’ll be using it to:

  • call an API with a query, similar to how we visit a website
  • see the output from that call
  • and export that output into a JSON file

The first step before starting any API call is to study the documentation: do a Google search for the name of the website you’re scraping plus “API documentation”. The documentation tells you what you can do with the API, which endpoints are available (think of these as different doors that provide separate services or types of data), and how to make requests.

In this example, we’ll be scraping data from Data USA (see its API documentation). It’s a free API that doesn’t require authentication, so anyone can call it and see the results immediately. Copy the following example API call to a new tab in your browser and press ENTER to visit it.

https://datausa.io/api/data?drilldowns=Nation&measures=Population

Depending on your browser, the data will be formatted in a different way. Chrome presents the output as raw text, which is the hardest to read. Firefox renders the JSON in a more readable, interactive view.

Before moving on, let’s break this request into pieces:

  • https://datausa.io is the website we’re contacting
  • /api/data is the API endpoint
  • everything after the ? sign is a series of key-value pairs. These are used to build queries, letting us extract the data we want from the database. The format of these is always key=value, and each key-value pair is separated by an & sign, like so: key=value&key=value

In this example, we have the drilldowns set to Nation and the measures set to Population, which will extract the total population from the selected nation (the US). But this API lets us explore other kinds of data. Try changing the drilldowns to State. Copy the following example to your browser tab or change the appropriate value and press ENTER.

https://datausa.io/api/data?drilldowns=State&measures=Population

Now, instead of seeing the total population for the entire United States, we’re seeing a breakdown by state. As you explore the API documentation and the data structure, you can build queries with additional key-value pairs, letting you dig deeper into the data and extract the exact lists you want.

To take this data with you, hit CTRL + S (or CMD + S on a Mac) to save the data as a JSON file. You can then head to ConvertCSV to convert that file to CSV. And once you have a CSV file in your hands, you can import it into Google Sheets or Excel. You can also import this data into ScreenCloud, letting you send it to all your screens effortlessly.
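If you’d rather skip the manual save-and-convert steps, a few lines of Python can call the API and write the CSV directly. Here’s a minimal sketch; it assumes the JSON response keeps its records under a "data" key with "State", "Year", and "Population" fields, so verify against the live output first.

# A minimal sketch: call the Data USA API and save the results straight to CSV.
# Assumes the JSON response stores its records under a "data" key with
# "State", "Year", and "Population" fields -- verify against the live output.
import csv
import requests

url = "https://datausa.io/api/data"
params = {"drilldowns": "State", "measures": "Population"}

response = requests.get(url, params=params)
response.raise_for_status()
records = response.json().get("data", [])

with open("state_population.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["State", "Year", "Population"])
    for record in records:
        writer.writerow([record.get("State"), record.get("Year"), record.get("Population")])

print("Saved", len(records), "rows to state_population.csv")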

Option 4: Code a data scraper in Python

A table of quotes parsed with Python code and saved into a spreadsheet.

If you have programming skills, you can code your own data scraper with Python (it’s also possible to do it with JavaScript, but we’re going with Python as it’s slightly more accessible). 

The advantages here are that you can build a scraper that fits your exact needs and the unique challenges of each web page. The downsides include the potentially long time to reach a functioning solution, and the fact that the code may need to be updated frequently if the target website changes on a regular basis.

Here’s the big-picture view of what your code must do:

  • First, you start by contacting the page’s URL and extracting the HTML
  • Then, you parse that HTML using the BeautifulSoup library
  • From the parsed HTML, you separate and store the data you want to extract in variables, using methods such as .findAll()
  • Once you have all the data you need, you proceed to write it to a CSV file using a loop, and finish by closing the file
  • To use the data you just scraped, you can import it to Google Sheets or Excel to embed in your ScreenCloud signage.

Before starting, remember that you need to install the Python interpreter and either use Terminal or an IDE to write and execute your code. Both PyCharm and Thonny are good options if you don’t already have a favorite one.

With the Python interpreter and IDE ready to go, here’s the list of libraries you’ll need:

  • requests handles the process of contacting the URLs and extracting the website’s HTML
  • BeautifulSoup parses HTML into a format that’s readable by Python
  • csv creates CSV spreadsheet files from your data, so you can later import them to Google Sheets or Excel.

Install those dependencies, then here’s a Python code template to get started with simple scraping jobs:

# Importing all the required libraries
from bs4 import BeautifulSoup
import requests
import csv
from itertools import zip_longest

# Connecting to the website and parsing the HTML
scrape_target = requests.get("http://quotes.toscrape.com")
soup = BeautifulSoup(scrape_target.text, "html.parser")

# Creating variables to store the data and selecting data from the HTML
quotes = soup.findAll("span", attrs={"class": "text"})
authors = soup.findAll("small", attrs={"class": "author"})

# Preparing a CSV file
with open("quotes.csv", "w", newline="") as file:
    writer = csv.writer(file)
    # Writing each row into the CSV file
    for quote, author in zip_longest(quotes, authors):
        quote_text = quote.text if quote else ""
        author_name = author.text if author else ""
        writer.writerow([quote_text, author_name])
        print(quote_text + " - " + author_name)

print("Data has been written to quotes.csv.")
# The CSV file created by running this code will appear in the same directory where you saved the script.
# You can import it into Google Sheets or Excel and manipulate the data there.

This code is enough for simple, static websites. For sites that require authentication, serve dynamic content, or have advanced anti-scraping measures, you’ll have to adapt your code to account for those characteristics. Still, the structure should always be the same: grabbing the data, selecting what you need from it, and setting up a loop to write it to a CSV file.
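For dynamic content in particular, one common adaptation (not covered in the template above) is to let a real browser render the page before parsing it. Here’s a minimal sketch using Selenium against the JavaScript-rendered version of the same demo site; it assumes you’ve installed the selenium package and a matching browser driver.

# A minimal sketch for JavaScript-heavy pages: render the page in a real browser
# with Selenium, then hand the finished HTML to BeautifulSoup as before.
# Assumes the selenium package and a matching ChromeDriver are installed.
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://quotes.toscrape.com/js/")  # JavaScript-rendered demo page
html = driver.page_source  # the HTML after the page's scripts have run
driver.quit()

soup = BeautifulSoup(html, "html.parser")
quotes = soup.findAll("span", attrs={"class": "text"})
print(len(quotes), "quotes found")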

Can I get in trouble for scraping data from an app or website?

In general, data scraping lies in a bit of a gray area. If the website you’re scraping relies heavily on data for monetization and for providing its main services, it’s likely that data scraping is frowned upon. Be sure to do a Google search for the website name plus “terms of use” to see if there’s any explicit guidance regarding data scraping. You might also notice some sites blocking your scraping bot if it scrapes too frequently, or you might be asked to complete a CAPTCHA—something the premium scraping tools can solve for you.

Generally, though, if you’re scraping data that’s publicly visible—or especially that’s inside your company’s paid software accounts—you should be fine scraping the data and using it on your signage.

Can I scrape data from a site that requires a login?

Yes, though it depends on which tool you’re using. Web scraping apps are good for this use case, since you can configure them to log in to apps and websites and then run their data scraping routine.

For coded solutions, you have to work out the target website’s authentication flow. Does it use a separate login server that you need to send credentials to? Is it session- or token-based authentication? You’ll have to recreate the process of logging in, contacting the appropriate servers, filling in the login forms, and storing any tokens, to secure the necessary privileges to access the data.
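Here’s a minimal sketch of what session-based login looks like with Python’s requests library. The URLs and form field names are hypothetical placeholders; inspect your target site’s login form to find the real ones.

# A minimal sketch of session-based login with the requests library.
# The URLs and form field names below are hypothetical placeholders --
# inspect your target site's login form to find the real ones.
import requests

LOGIN_URL = "https://example.com/login"     # hypothetical login endpoint
PROTECTED_URL = "https://example.com/data"  # hypothetical page behind the login

with requests.Session() as session:
    # The session stores cookies, so logging in carries over to later requests
    credentials = {"username": "your_user", "password": "your_password"}
    login_response = session.post(LOGIN_URL, data=credentials)
    login_response.raise_for_status()

    # Fetch the protected page with the authenticated session
    page = session.get(PROTECTED_URL)
    print(page.text[:500])  # hand this HTML to BeautifulSoup as in Option 4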

Building digital signage with any data you want

Now it’s time to start building more advanced digital signage with any data you want to share with your team.

Want to copy full widgets, graphs, and formatted text from your apps? ScreenCloud’s built-in Dashboards is perfect for that.

Want to copy raw text and show it on your custom ScreenCloud Playgrounds screens or embed it with spreadsheet integrations? Scraping bots or spreadsheet functions are your best bet.

Think through the types of data you’d like to share with your team, and where it might be hidden. Then set up the bots and parsers that work best for your team, and build automated dashboards that pull live data from as many apps and sites as you want.

And if you don’t already have digital signage in your company, sign up for a free ScreenCloud trial and start building your very first signage, scraping the data you need to share with your team right from the start.

Image Credit: Header Photo by ThisisEngineering RAEng via Unsplash
